Advance Praise for The Theory and Craft of Digital Preservation

Screenshot of the page for the book on the Johns Hopkins University Press site.

I’m in the home stretch for the book to come out! It’s got a cover and it’s up on the Johns Hopkins University Press site. You can even pre-order it today, and it should get to you some time in November.

All told, this is bringing to fruition a project that I started back in 2016. It’s been a long road, but I’ve really loved it. The book largely brings together things I’ve learned in dialog with the digital preservation community, and I can also say that the process of writing the book in the open felt like a genuine continuation of that learning-through-dialog process.

It has been really neat to see the book blurbs starting to roll in. I’m floored by the very kind and thoughtful comments I’ve received from people whose work I deeply respect and admire. To that end, below is the advance praise that the press has received for the book.

“Part of a long-standing and worldwide tradition of memory keepers” – David Ferriero

“Acknowledging that we are part of ‘a long-standing and worldwide tradition of memory keepers,’ Trevor Owens challenges us to use the lessons learned in traditional preservation as we approach digital preservation. Distinguishing digital preservation as craft rather than science, Owens provides reassurance to all of us worried about finding the ‘silver bullet.’ It does not and should not exist!” — David S. Ferriero, Archivist of the United States, National Archives and Records Administration

“An indispensable handbook” – Matt Kirschenbaum

“An indispensable handbook that will be kept close at hand—used, reached for, and above all really read by those seeking a conceptual framework through which to understand the practicalities of grappling with the complex new reality of digital objects. Opening up the most theoretically sophisticated body of research in digital platforms to an entirely new audience while simultaneously equipping that audience with the conceptual background they need to function as experts in today’s information environment, Owens’s book is a practical, even-handed, and clear-eyed walkthrough of day-to-day situations. I expect it will be widely adopted in library and information science courses.” — Matthew G. Kirschenbaum, University of Maryland, College Park, author of Track Changes: A Literary History of Word Processing

“His axioms for digital preservation will guide novices and experts alike.” – Deanna Marcum

“Digital preservation, unlike the one-time process for preserving print, is an ongoing, changing responsibility for those who bear the responsibility of preserving our history and cultural heritage. Trevor Owens, a leader in the field, uses his experience and deep knowledge to show how the tools of the futurist can document the past. His axioms for digital preservation will guide novices and experts alike.” Deanna Marcum, Ithaka S+R

“An ideal text for anyone interested in archives in the digital era” – Steven Lubar

“A superb introduction to both the why and how of preserving digital cultural heritage. The Theory and Craft of Digital Preservation highlights history and theory, explains technology, and then moves on to practice, offering clear advice backed by examples. This is an ideal text for anyone interested in archives in the digital era.” — Steven Lubar, Brown University, author of Inside the Lost Museum: Curating, Past and Present

“At once historical synthesis, practical guide, and philosophical overview” – Alan Liu

“Owens blends the perspectives of archivist and media archaeologist to provide a richly satisfying appraisal—at once historical synthesis, practical guide, and philosophical overview—of what digital preservation can be. Its standout feature is a wise, practical approach for guiding even the smallest institutions in using technology for the ‘craft’ of preservation.” Alan Liu, The University of California, Santa Barbara, author of The Laws of Cool: Knowledge Work and the Culture of Information

Student Digital Preservation Consultants Looking for Small Cultural Heritage Organizations

For many, this is where we find ourselves in organizations just starting to work on digital preservation.

I’m revising my digital preservation graduate seminar for the University of Maryland’s iSchool for this coming fall.

I am a firm believer in learning-by-doing. I also think talking about digital preservation in the abstract, outside the very real resource and time constraints of organizations, largely misses the point. So, as I did when I taught the course two years ago, I am planning to have each student work through a series of assignments where they serve as digital preservation consultants to small cultural heritage organizations.

My intention in this approach is to offer both a meaningful learning opportunity for the students, as well as a way for them to start building out a portfolio of work that will be relevant to potential future employers. Based on how this worked last time, I am also optimistic that this can be a way to provide some help to small cultural heritage organizations that could benefit from learning together with students in thinking through and developing plans to make the best use of resources to make their digital content more long-lived.

For context on the potential value of this work to an organization, consider this reflection from a preservation specialist at a state cultural heritage institution who worked with one of my students last time I taught this course.

Because of our participation in this course, we have concrete steps forward as we work to develop guidelines and implement good digital preservation practice. The open nature of your course and associated materials has allowed our staff to develop their own subject knowledge and continue research.

With that context, I’m happy to offer some more information about how you (or others you know) can reach out about having a grad student from the course work with you. While students are in the DC Metro area, the assignments can all be completed remotely, so your organization need not be located in the metro area.

For a sense of the range of organizations that this might be relevant for, last time students worked with organizations including the DC Punk Archive, Milwaukee Public Library, Litchfield Historical Society, Laurel Historical Society, Bostwick House, Maryland Public Television, 18th Street Singers’ Digital Collection, North Dakota State Library, and the Virginia Department of Historic Resources Archives and Library.

Requesting a Graduate Student Digital Preservation Consultant

I think the finish line for digital preservation is a little too close to the starting line here. But it gets at the idea 🙂

If you (and your organization) would be interested in having a University of Maryland graduate student in my digital preservation seminar focus their digital preservation consultant project on your organization, please take two minutes to fill in this five-question form. I think this is a great opportunity for organizations for a few different reasons.

Here are some reasons to consider filling in the form for your organization. This project is a chance to:

  1. Solicit assistance thinking through digital preservation issues and planning for your organization.
  2. Provide a meaningful learning experience to someone just getting started in the field
  3. Learn more about digital preservation as the student shares what they are learning through the class

Through the course of the assignments, students will:

  1. Document and review current practices with an organization’s digital content
  2. Draft suggestions for potential next steps to improve management of digital content grounded in the resources an organization has access to
  3. Draft a digital preservation policy for consideration for the organization

On the first day of class (August 30th), I will present the organizations that have filled out the survey to my students. In the first few weeks of class I will help to pair each student with an organization for the semester.

If you are matched up with a student, the idea would be that you would commit to doing an interview or two with them about your organization’s collection and current practices for digital material and that you would review and provide input on several of their assignments (listed below).

I should underscore that it is completely fine for organizations to be literally at square one in terms of digital preservation practices and planning. So many cultural heritage organizations are just getting started with their digital preservation planning, and while it can be a bit intimidating to take some first steps in this space, there are many simple and inexpensive things organizations can be doing to mitigate risks of loss. The assignment will be most valuable for both students and organizations in cases where there is little current work being done in digital preservation. As part of this project, students will be blogging about their work, so you and your organization will need to be OK with them sharing information about the project. This can be a bit intimidating, but having students work on their public writing skills and inviting a broader audience into discussion about how to do this work in organizations will help to ensure that the quality of that work is stronger and more useful. Through this public writing process, the results of the work will be more useful to both the student and to your organization.

What follows are details about the design of this assignment.

Digital Preservation Consultant Project

Here you can see a student synthesizing what they have found and drafting a plan.

An academic understanding of the issues in digital preservation is necessary but not sufficient for professional digital preservation work. Digital preservation is fundamentally about making the best use of what are always limited resources to best support the mission of an organization. As such, to really learn how to do digital preservation you need to apply these concepts in the practical realities of an organizational context.

Aside from participating in discussion of the course readings through the course blog, the other course assignments will require you to act as a digital preservation consultant for a cultural heritage organization. For a variety of reasons I suggest this be a small institution. Below are the five assignments you must complete over the course of the semester as part of this project.

  1. Identify Small Cultural Heritage Organization and Establish Partnership (by week 4): For most of the course assignments, you will need to find a small cultural heritage organization that you can work with as a digital preservation consultant. I have identified a list of organizations that are up for participating, but you are free to find other organizations as well. The key requirements here are that 1) they have consented to working with you 2) they have some set of digital content but 3)  their collections are not so complex that you couldn’t possibly do the project. Example institutions include an independent organization (like a house museum, a community archive or library), a small department or subset of an institution (say the archives of a student newspaper or radio station, the special collections department at a public library, or the archives in a museum).
    1. Deliverable: The output of this phase is to identify this organization and confirm that you have a commitment from them to participate. We will check in on this in class as we go, but by the date of this assignment you need to have confirmed participation of an organization that meets these requirements and have posted what organization you are working on in a list on the course website. On the site, post the name of the organization, your name (or handle) and two or three sentences about the organization and its digital content.
  2. Institutional Digital Preservation Survey (Draft by week 6 and send to your org, publish with their comments incorporated by week 8): For your organization, interview one or two staff members to get a handle on their digital collections and practices. Draw from the NDSA levels of digital preservation as an overall framework for conducting your survey. You will want to focus on gathering information about their practices in five key areas.
    1. First, what is the scope of their digital holdings?
    2. Second, how is that digital content currently being managed?
    3. Third, what are the staff at the organization’s perceptions of the state of their digital content (are they concerned about it, do they see it as mission critical or a nice to have, what do they see as their own self-efficacy and their organization’s capacity for sustaining their content)?
    4. Fourth, what kinds of digital content would the organization like to be collecting but currently isn’t?
    5. Fifth, what resources, if any, do they have that they could bring to bear on this problem? (If they have some significant potential resources that’s great, but realize that there may well be very meaningful smaller resources that could be brought to bear. For example, could one staff member spend 2-4 hrs a week on digital preservation, could they bring in community volunteers, how much could they spend on things like extra hard drives, etc.?) Throughout all of this, it will be important to understand what the organization’s collecting mission is. You want to begin to probe all the questions above, but you need to be able to map their answers to the NDSA levels.
    6. Deliverable: You will write and publish a post to the course blog (1200-3000 words) in which you present the findings of your survey. The post should first provide context: what is this organization, what are its digital holdings, and what does it want to be collecting? From there, work through presenting an accurate and coherent report of the themes and issues that came through in your interviews. At this point you are primarily interested in accurately representing the state of their work. Do not get into making recommendations. Simply do your best to succinctly and coherently explain what you found about the five areas of questioning discussed above. Before publishing this, you must present it to your org for their feedback to make sure you have their input on how you are describing the state of their work.
  3. Institutional Digital Preservation Next Steps Preservation Plan (Week 10): Now that you have the results of your survey, it is time to take out the NDSA levels of digital preservation and the rest of our course readings and figure out what a practical set of next steps would be for your organization.
    1. Deliverable: Post your next steps plan to the course blog (1200-3000 words). After a brief introduction providing context about the organization and its collections, you should work through reviewing the organization’s current work on digital content using each of the areas of the NDSA levels of digital preservation. Complete by identifying three different levels (low, medium and high resource requirement) of next steps they could take to improve their rating on the NDSA levels of digital preservation. Be creative here. For example, could they upload collection items to the Internet Archive or Wikimedia Commons? Or could they buy an extra hard drive, make copies, and swap it with a backup buddy at another organization in a different region of the country? The point here is to think about how to get them the furthest up some of the levels with the resources at hand. Before publishing this, you should present it to your organization for them to review and provide input.
  4. Draft a Digital Preservation Policy for Your Org (Week 12): Now that you have put in place a set of recommendations, it is important to also draft up a set of digital preservation policies and practices for the organization. If this is to have any impact you are going to need to be able to articulate what the organization’s policies could be going forward.
    1. Deliverable: Drawing on the example digital preservation policies we read in class, draft up a short policy document for your institution tuned to what you have learned from working with them. Draw from the examples for models for aspects of this document. Share it with them for some input and feedback. Then post it to the blog (800-1500 words).
  5. Reflecting on Lessons Learned (Week 13): After doing this work, presenting it, and getting feedback from your organization, you need to think through what worked and didn’t work for the project. Take time for reflection and tease out the lessons you’ve learned about both digital preservation and working with a cultural heritage organization.
    1. Deliverable: Return to each of the documents you created thus far and synthesize 3-5 points about what did or didn’t work or what your takeaway lessons are from this process. Think through what you will do differently the next time you help an organization improve its digital preservation practices. Bring in references to what you’ve learned from readings in the course and from what you have learned from your classmates’ work on their projects (800-1400 words).

All images from Digitalbevaring.dk, published under a Creative Commons Attribution 2.5 Denmark license and created by Jørgen Stamp.

Catching up to the Present: Join the Born Digital Community of Practice

I was thrilled to have the chance to write the foreword to Heather Ryan and Walker Sampson’s new book The No-Nonsense Guide to Born Digital Content. I wrote it last year, but the book just rolled out last month. It’s full of hands-on practical guidance that I think complements my own forthcoming book The Theory and Craft of Digital Preservation (free OA preprint here). I checked in and Heather was OK with me sharing it here. Excited to see work like this getting out there!

When historians tell stories of life in the latter half of the 20th and beginnings of the 21st century they will do so from an evidentiary basis of born-digital primary sources: emails, websites, Word documents, PDFs, video and audio files. It is from born-digital objects like these that people of the future will come to understand our world. I continue to use the somewhat awkward phrase “born digital” because for most library, archives and museum professionals digitization remains their default conception of what digital collection content is. That needs to change. We need to catch up to the digital present and I think The No-Nonsense Guide to Born Digital Content can help us.

Librarians, archivists and museum professionals need to collectively move away from thinking about digital, and in particular born-digital, as being niche topics for specialists. If our institutions are to meet the mounting challenges of serving the cultural memory functions of an increasingly digital-first society, the institutions need to transition to become digital-first themselves. We can’t just keep hiring on a handful of people with the word digital in their job titles. You don’t go to a digital doctor to get someone who uses computing as part of their medical practice and we can’t expect that the digital archivists are the ones who will be the people who do digital things in archives. The things this book covers are things that all cultural heritage professionals need to get up to speed on.

Classic DerangeDescribe tweets.

I am thrilled to have the chance to open Heather and Walker’s book. I have known both of them directly and indirectly through our shared travels through the world of digital preservation. In what follows I offer a few of my thoughts and observations for you to take with you as you work through this book on a journey into the growing digital preservation community of practice.

To kick off your exploration of this book I will lay out three observations I believe are essential to this journey: we will never catch up, our biggest risk is inaction, and we all need to get beyond the screen in our understanding of digital information. Together, I believe these points demonstrate the need to use this book as a stepping stone, a jumping off point for joining the community of practice engaged in the craft of digital preservation.

Forever catching up to the present

I’ve borrowed part of the title of my foreword from a talk Michael Edson, then the Director of Web Strategy, gave several years ago. In that talk Edson implored digital preservation practitioners to help their institutions catch up to the present. I’ve heard many talk about “the digital revolution” like it was a singular thing that happened. It wasn’t. Instead we have entered something that for the time being at least looks more like a permanent state of digital revolution. Punch cards, mainframes, personal computers, the Internet, the web, social media, mobile computing, computer vision, and now things like voice-based interfaces and the internet of things are all varying and distinct elements in the continually changing digital landscape. It doesn’t seem like we will land in a new normal, or if there is a new normal, it is to expect a constantly changing digital knowledge ecosystem. In this context, there is much for librarians to teach and much for us to learn. We need to move more and more into a state of continual professional learning. We need to be improving our digital skills and chops by engaging in professional development and by taking on ways to become experts in new areas. This book can help you do that. In what follows I will briefly suggest three

Inaction as one of our biggest risks

There is no time to wait. Digital media is more unstable and more complex than most of the media librarians, archivists and curators have worked with. We don’t have time for a new generation of librarians and archivists to move into the field. We don’t have time for everyone to do years of professional development. Instead, we need to make space and time for working cultural heritage professionals to start engaging in the practices of digital curation. This book can be a huge help in this regard.

Get beyond the screen

Digital information isn’t just what it looks like on the screen at a given moment. To be an information professional in an increasingly digital world requires all of us to get beyond the screens in two key ways. First, we all need to develop a base-level conceptual understanding of the nature of digital information. This book is helpful in that regard by providing some foundational context for understanding bitstreams and data structures. Second, we need to up our game for working with command line tools and scripts. As the pace of change around digital information accelerates, we can’t depend on the development of tools with slick graphical user interfaces. We need to accept that all the systems and platforms we use are layers and interfaces to our digital assets. That is, your content isn’t “in” whatever repository system you use; that system is best understood as the current interface layer that effectively floats on top of the digital assets you are ensuring long-term access to. The hands-on focus of this book and the inclusion of methods and techniques for working with data at the command line is invaluable as a jumping off point for learning this kind of skill and technique.
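To make "getting beyond the screen" a little more concrete, here is a minimal Python sketch of looking at a file as a bitstream rather than as something rendered on a screen. It is an illustration only, not something drawn from the book: the function name `describe_bitstream` and the sample file are hypothetical, and it assumes Python 3.8 or later.

```python
import hashlib
import tempfile
from pathlib import Path


def describe_bitstream(path, preview_bytes=16):
    """Return the size, leading bytes (as hex), and SHA-256 digest of a file."""
    data = Path(path).read_bytes()
    return {
        "size_bytes": len(data),
        # The first few bytes often reveal a format signature ("magic number").
        "leading_hex": data[:preview_bytes].hex(" "),
        # A digest identifies this exact sequence of bits, whatever the filename.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


if __name__ == "__main__":
    # Hypothetical sample file, created in a temporary folder so the sketch runs on its own.
    with tempfile.TemporaryDirectory() as folder:
        sample = Path(folder) / "sample.txt"
        sample.write_bytes(b"Hello, archive!\n")
        for key, value in describe_bitstream(sample).items():
            print(f"{key}: {value}")
```

The point of the sketch is only that the same bytes sit underneath whatever interface currently displays them; any repository system or viewer is a layer on top of this.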

Embracing the craft

For more on the idea of digital preservation as craft check out my forthcoming book.

When I started working in digital preservation more than a decade ago I was largely confused and befuddled by a field that presented points of entry to the work as complex technical specifications and system requirements documents. It felt like there were a lot of people talking about how the work should be done and not a lot of people doing the work that needed to be done. I’ve been very excited to see the field turn that corner in the last decade.

We are moving further and further away from the idea that digital preservation is a technical problem that the right system can solve toward the realization that ensuring long-term access to digital information is a craft that we practice and refine by doing the work. I think this book can help us all become better reflective digital preservation practitioners. However, it can only do that if you actually start to practice it. So do that. If you aren’t already, go ahead and start to participate in the practice and join the community that is forming around these practices.

You can use this book to help to start learning by doing. You will get the most value out of this book if you are trying to work through the process of getting, describing, managing and providing access to digital content. As you go you are going to need to write down what you are doing and why you are doing it the way you are. One of my mentors, Martha Anderson, would always describe digital preservation as a relay race. You’re just one of the first runners in a great chain of runners carrying content forward into the future. When those folks in the future inherit your content they are going to need to understand why you did what you did with it and the only way they are going to be able to do that is by reading the documentation you produced regarding the how and the why of all the choices you’ve made. So be sure to write that down. I would also implore you to share what you write as you go.

It’s dangerous to go alone, take this community with you.

Around every corner there is another new kind of content. There is another challenging issue regarding privacy, ethics and personal information. There is another set of questions about how to describe and make content discoverable. There is another new kind of digital format, another new interface, and another new form of digital storage. You can’t do this alone. The good news is that everyone working on these issues in libraries, archives, museums, nonprofits, government, and companies can share what we figure out as we work through this process and build a global knowledge base of information about this work together. Take this book as a jumping off point.

Join digital preservation focused organizations like the National Digital Stewardship Alliance, the Research Data Alliance, the International Internet Preservation Consortium, the Electronic Records Section of the Society of American Archivists, and the Digital Preservation Coalition. Go to their conferences, start following people involved in these groups on Twitter, and follow their journals, their blogs, and their email lists.

It’s dangerous to go alone! Take this book as the starting point of a journey into our community of practice and realize that you are not alone. Even if it really is just you working on digital preservation as a lone arranger at a small organization the rest of us are out here working away at the same problems.

Parsimony and Elegance as Objectives for Digital Curation Processes

I’m increasingly convinced that parsimony and elegance are key values for the socio-technical systems that enable long term access to information. This post is me starting to try and articulate what I mean by that and connecting that back to a few ongoing strands of work and thinking I’m engaged in.

Now that the book has been circulating around a bit, I’ve been able to both reflect on it and get to have a lot of great conversations with people about it. Along with that, I’ve been participating (or at least trying to participate when my calendar allows) in some ongoing conversations about the role of maintenance, capacity, care, and repair in library work.

My points of entry into these conversations have been Bethany Nowviskie’s Capacity and Care, Steve Jackson’s piece Rethinking Repair, Hillel Arnold’s Critical Work: Archivists as Maintainers, and Andrew Russell and Lee Vinsel’s work in pieces like Innovation is overvalued: Maintenance often matters more. As I mentioned in a previous post, I think there is a ton more that I need to sort through in Nel Noddings’s line of thinking on an ethics of care, and that is all tied up in this too. So take those as trail heads to what I think is going to grow more and more into a major part of our professional discourse. Notions of capacity and maintenance all implicate notions of sustainability.

Less is More Sustainable and Maintainable

The specific prompt for this post was one conversation where I ended up saying something I’ve said a few times before. Something like: “If you can do it with an Access database then don’t gather requirements for a software engineering project.” Furthermore, “If you can do it with a spreadsheet, don’t build an Access database.” Beyond that, “If you can do it with a text file, then don’t set up a spreadsheet.” The general point in each of these situations is that you want to use the least possible tool for the job and then, when the complexity of the work demands it, you justify the added complexity of the next thing.

When you get to the point where you need something more complex you are going to know a lot about what you really need. Sneakernetting your way through a workflow end to end is going to enable you to figure out what the process really involves and needs. The last thing you want to do is spend three years in meetings gathering requirements based on what you think you might need.
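As a small illustration of the "text file before spreadsheet" end of that ladder, here is a hedged Python sketch that writes a plain tab-separated inventory of a folder: one line per file, with its relative path and size. The function name `write_inventory` and the choice of columns are assumptions made for the example, not a prescribed format.

```python
import os
from pathlib import Path


def write_inventory(root, inventory_path):
    """Write one tab-separated line (relative path, size in bytes) per file under root.

    Returns the number of files listed.
    """
    root = Path(root)
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            rows.append((full.relative_to(root).as_posix(), full.stat().st_size))
    rows.sort()  # stable ordering makes two inventories easy to compare by eye or with diff
    with open(inventory_path, "w", encoding="utf-8", newline="\n") as out:
        for rel_path, size in rows:
            out.write(f"{rel_path}\t{size}\n")
    return len(rows)
```

A file like this opens in any text editor and can still be pulled into a spreadsheet later, if and when the work justifies that extra complexity.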

I often recall some smart stuff that the 37 Signals crew have avowed, namely that “Until you’ve actually thrown the ball at the wall, you don’t know how it’ll bounce back.” It seems to be true for software, for workflows, for procedures, for org structures. You name it.

Parsimony and Elegance

I’m becoming increasingly convinced that concepts of parsimony, elegance, and simplicity have a core place as anchors in the work of digital preservation and curation.

For some context, here I intend the definition of parsimony as:

“Using a minimal number of assumptions, steps, or conjecture”

and the definition of elegance as:

The beauty of an idea characterized by minimalism and intuitiveness while preserving exactness and precision

That is, our workflows, processes, and systems are parsimonious to the extent that they use a “minimal number of assumptions or steps.” They are elegant to the extent that they are characterized by “minimalism and intuitiveness while preserving exactness and precision.” This isn’t to say that this infrastructure won’t become complex, but to say that it should only be as complex as it absolutely needs to be.

All Unnecessary Added Complexity is a Sustainability Threat

One of the core activities of digital curation and preservation work is imagining what happens when particular things might go wrong. “What if this thing broke?” Or, “What if so-and-so took a different job, you know the one who built this really complicated piece of software?” Or, “What if the other organizations investing developer time in this complex application we are using shifted to invest their time in something else?” Or, “What would happen if this company we are paying to provide this platform or service changed their business model?” In all of these cases, the more dependent you are on something the more risk you expose yourself to.

Significantly, you must expose yourself to risks. You’ve got to be dependent on a bunch of things, you just want to be deliberate about what you are being dependent on. You need exit strategies for your exit strategies. But in all of that you can take heart that the less complex the platforms, tools, services, processes you use are the easier it will be to move on to whatever the next thing of those is going to be. Believe me, the next thing is always coming. Whatever tools, processes, systems, methods you use today are just the things you use today. The shiny new thing of today will be the old crummy thing that you want nothing to do with tomorrow.

Relevant Axioms

Below are the axioms from my book that I think are most relevant to, or that imply, some of the points I’ve tried to make about parsimony and elegance.

1. A repository is not a piece of software. Software cannot preserve anything. Software cannot be a repository in itself. A repository is the sum of financial resources, hardware, staff time, and ongoing implementation of policies and planning to ensure long-term access to content. Any software system you use to preserve and provide access to digital content is by necessity temporary. You need to be able to get your stuff out of it because it likely will not last forever. Similarly, there is no software that “does” digital preservation.

3. Tools can get in the way just as much as they can help. Specialized digital preservation tools and software are just as likely to get in the way of solving your digital preservation problems as they are to help. In many cases, it’s much more straightforward to start small and implement simple and discrete tools and practices to keep track of your digital information using nothing more than the file system you happen to be working in. It’s better to start simple and then introduce tools that help you improve your process than to simply buy into some complex system without having gotten your house in order first.

4. Nothing has been preserved, there are only things being preserved. Preservation is the result of ongoing work of people and commitments of resources. The work is never finished. This is true of all forms of preservation; it’s just that the timescales for digital preservation actions are significantly shorter than they tend to be with the conservation of things like books or oil paintings. Try to avoid talking about what has been preserved; there is only what we are preserving. This has significant ramifications for how we think about staffing and resourcing preservation work. If you want to evaluate how serious an organization is about digital preservation, don’t start by looking at their code, their storage architecture, or talking to their developers. Start by talking to their finance people. See where digital preservation shows up in the budget. If an organization is serious about digital preservation, it should be evident from how they spend their money. Preservation is ongoing work. It is not something that can be thought of as a one-time cost.

9. Digital preservation is about making the best use of your resources to mitigate the most pressing preservation threats and risks. You are never done with digital preservation. It is not something that can be accomplished or finished. Digital preservation is a continual process of understanding the risks you face for losing content or losing the ability to render and interact with it and making use of whatever resources you have to mitigate those risks.

12. Highly technical definitions of digital preservation are complicit in silencing the past. Much of the language and specifications of digital preservation have developed into complex sets of requirements that obfuscate many of the practical things anyone and any organization can do to increase the likelihood of access to content in the future. As such, a highly technical framing of digital preservation has resulted in many smaller and less resource rich institutions feeling like they just can’t do digital preservation, or that they need to hire consultants to tell them about complex preservation metadata standards when what they need to do first is make a copy of their files.

Full Draft of Theory & Craft of Digital Preservation

Here it is, the book printed out for the first time. Or I suppose more accurately, a digital photo of the book printed out for the first time.

This weekend I’m submitting the full draft of the manuscript for my book The Theory and Craft of Digital Preservation to the publisher, Johns Hopkins University Press.

Update: to make it easier to read, I’ve shared a PDF preprint of the whole draft.

I’ve had a lot of fun working on this on nights and weekends over the last year. I have also learned a ton from everyone who has read drafts of the work in progress.

I’ve had a few folks reach out to me after reading parts of drafts and say things like “I’d love to read more of this. When will it be out?” I’m not sure exactly how long it will take for the next round of review and all the improvements that will come from working with a great press. With that said, drafts of the entire book are now online. Instead of having folks pick through my previous blog posts with the links, I figured I would put them all together in order in this post.

So to that end, below you can find an index to the eight chapters and the intro and conclusion. I’m going to leave these up with all the comments in them. I went through and resolved comments offline in my own copies of these but thought it would be fun to leave up the messy original drafts and a record of all the great input and ideas that folks have offered up to improve the text.

Table of Contents

Introduction Beyond Digital Hype & Digital Anxiety (7 pages)

Section One: Theory of Digital Preservation

Ch 1: Preservation’s Divergent Lineages (14 pages)

Ch 2: Understanding Digital Objects (12 pages)

Ch 3: Challenges & Opportunities for Digital Preservation  (11 pages)

Section Two: The Craft of Digital Preservation

Ch 4: The Craft of Digital Preservation (6 pages)

Ch 5: Preservation Intent & Collection Development (13 pages)

Ch 6: Managing Copies & Formats (15 pages)

Ch 7: Arranging & Describing Digital Objects (19 pages)

Ch 8: Enabling Multimodal Access & Use (18 pages)

Conclusion: Tools for Looking Forward (9 pages)

Advance Twitter Praise for the Book

I pulled out a few tweets from folks responding to the book that I thought were fun to share.

https://twitter.com/save4use/status/877128435696619520

https://twitter.com/supalaze/status/877272995735142400

Theory & Craft of Digital Preservation: My Next Book

Some class notes from Alice Rogers in my digital preservation seminar.

This has been brewing for a while, but it’s now enough of a thing that I can share about it. I am excited to announce that I’m on the hook with Johns Hopkins University Press to produce a short book (30-40k words) called The Theory & Craft of Digital Preservation: An Introduction.

I have about half of the book together in a really rough draft form. Much of my nights and weekends for about the next six months will be spent working up the rest of it and getting the whole thing together.

The genesis of the book came when I was designing my digital preservation seminar and realized that I feel like much of the beaten path for talking about digital preservation has more to do with how we got to what we do now than how it would make sense to explain the issues and topics to folks from scratch. So the course has given me a chance to try out the road-map for the book.

I’ve gotten the OK to share drafts of the chapters as they start to come together. I’ve found that I benefit dramatically from doing my writing in the open where folks can help me refine and sharpen my ideas before they end up fixed in any particular medium.

To that end, I figured I would share most of the book proposal I worked up. In working on drafting, some of this has started to shake out a bit differently, but I thought folks might be interested in a preview. I’m thinking I will start posting a chapter or two a month early-ish in the new year.

Overview of the Book

The historical record is increasingly digital. Over the last half century, under headings of “electronic records management” and “digital preservation,” librarians, archivists, and curators have established practices to ensure that our digital scientific, social and cultural record will be available to scholars and researchers into the future. This book is intended as a point of entry into that theory and practice.

Through years of leading collaborative national digital strategy efforts to ensure long-term access to digital content, I have observed that many experts in digital media and libraries, archives and museums often end up talking past each other as they work toward their mutual goals. All too often, discussions of digital preservation fail to fully state and engage with the nature of digital objects and media, thereby undermining our ability to do this work in a common and coherent fashion.

This failure of understanding is rooted in two fundamental issues: First, that preservation itself is not a single area of activity, but has always been historically intertwined with distinct disciplines that have grappled with the affordances of various historically “new” mediums. Second, that there are distinct affordances of digital media that require rethinking those diverse perspectives on preservation and conservation. The central contribution of this book is to put the lineages of preservation in dialog with the affordances of digital media as a basis to articulate a theory and craft of digital preservation.

As a guidebook and an introduction, this text is a synthesis of extensive reading, research, writing, and speaking on the subject of digital preservation. It is grounded in my work on digital preservation at the Library of Congress and before that, working on digital humanities projects at the Center for History and New Media at George Mason University.  The first section of the book synthesizes work on the history of preservation in a range of areas (archives, manuscripts, recorded sound, etc.) and sets that history in dialog with work in new media studies, platform studies, and media archeology. The later chapters build from this theoretical framework as a basis for an iterative process for the practice of doing digital preservation.

This book serves as both a basic introduction to the issues and practices of digital preservation and a theoretical framework for deliberately and intentionally approaching digital preservation as a field with multiple lineages.  The intended audience is current and emerging library, archive, and museum professionals as well as the scholars and researchers who interface with these fields. As such, the book will be useful as assigned reading for graduate courses in digital preservation and digital curation in library science, museum studies, and public history programs. This book is also highly relevant to digital humanities programs and courses as the work of digital humanists increasingly results in the development of digital platforms, tools and resources which face significant sustainability challenges and thus require an understanding of digital preservation planning to succeed.

There are a handful of books on digital preservation, but this book is significantly different in two key ways. First, it is intentionally brief. Because of this, it is more accessible and usable by a wide range of stakeholders in digital preservation. This is not an exhaustive work on the subject, but a clear and focused perspective and approach. Second, it treats digital preservation as a craft and anchors it in work in humanities scholarship on media and mediums. Much of the extant work on digital preservation approaches the subject as one that is highly technical, which continues to obfuscate many key issues and assumptions, particularly for humanities scholars interested in understanding digital preservation. While the book has a practical bent, it is not a how-to book that would quickly become outdated. It establishes and offers stages and processes for doing digital preservation, but it is not tied to particular tools, methods, or techniques. Instead, it is anchored in an understanding of the traditions of preservation and the nature of digital objects and media.

Sections of the Book 

Introduction: Getting Beyond Digital Hyperbole

At a summit on digital preservation at the U.S. Library of Congress in the early 2000s, a participant from a technology company proposed, “Why don’t we just hoover it all up and shoot it into space.” The “it” in this case being any and all historically significant digital content. Many participants laughed, but it wasn’t intended as a joke. Many have sought, and continue to seek, similar “moon-shots,” singular technical solutions to the problem of enduring access to digital information.

More than a decade later, we find ourselves amid the same set of stories we have heard for at least thirty years. Among the public, there is a persistent belief that if something is on the Internet, it will be around forever.  At the same time, warnings of a potential impending “digital dark age,” where records of the recent past become completely lost or inaccessible appear with regular frequency in the popular press as well.

To many, it seems like the world needs someone to design a system that can “solve” the problem of digital preservation. The wisdom of the cohort of digital preservation practitioners in libraries, archives, and museums who have been doing this work for half a century suggests this is an illusory dream not worth chasing. Working to ensure long-term access to digital information is not a problem for a tool to solve. It is a complex field with a significant ethical dimension. It is a vocation.

The purpose of this book is to offer a path for getting beyond the hyperbole and the anxiety of the digital and establish a baseline for practice in this field. To do this, one needs to first unpack what we mean by preservation. It is then critical to establish a basic knowledge of the nature of digital media and digital information. With these in hand, anyone can make significant and practical advances toward mitigating the most pressing risks of digital loss. For more than half a century, librarians, archivists, and curators have been establishing practices and approaches to ensure long-term access to digital information. Building from this work, this book provides both a sound theoretical basis for digital preservation and a well-grounded approach to its practices and craft.

Section One: Historicizing Preservation and Digital Media

Chapter One: Preservation’s Divergent Lineages

Interdisciplinary dialog about digital preservation often breaks down when an individual begins to protest “but that’s not preservation.” Preservation means a lot of different things in different contexts. Each of those contexts has a history. Those histories are tied up in the changing nature of the mediums and objects for which each conception of preservation and conservation was developed. All too often, discussions of digital preservation start by contrasting digital media to analog media. This contrast forces a series of false dichotomies. Understanding a bit about the divergent lineages of preservation helps to establish the range of competing notions at play in defining what is and isn’t preservation.

Building on work in media archeology, this chapter establishes that digital media and digital information should not be understood as a rupture with an analog past. Instead, digital media should be understood as part of a continual process of remediation embedded in the development of a range of new mediums which afford distinct communication and preservation potential. Understanding these contexts and meanings of preservation establishes a vocabulary to articulate what aspects of an object must persist into the future for a given preservation intent.

To this end, this chapter provides an overview of many of these lineages. These include the culture of scribes and the manuscript tradition; the bureaucracy and the development of archival theory for arranging archives and publishing records; the differences between taxidermy and insect collecting in natural history collections and living collections like butterfly gardens and zoos; the development of historic preservation of the built environment; the advent of recorded sound technology and the development of oral history; and the development of photography, microfilming and preservation reformatting. Each episode and tradition offers a mental model to consider deploying in different contexts in digital preservation.

The purpose here is not a detailed history of lineages of preservation and the development of media, but instead to illustrate that many different conceptions of preservation exist and how those conceptions are anchored in different objectives. This overview provides readers with a focus on the distinct conceptions of what matters about an object and the innate material properties and affordances of different kinds of media as they relate to preservation.

Chapter Two: Understanding Digital Objects

Doing digital preservation requires a foundational understanding of the structure and nature of digital information and media. This chapter works to provide such a background through three related strands of new media studies scholarship. First, all digital information is material. Second, digital information is best understood as existing in and through a nested set of platforms. Third, the database is an essential media form and metaphor for understanding the logic of digital media.

Given that digital information is always physically encoded on digital media, it is critical to recognize that the raw bit stream (the sequence of ones and zeros encoded on the original medium) has a tangible and objective ability to be recorded and copied. This provides an essential first level basis for digital preservation. It is possible to establish what the entire sequence of bits is on a given medium, or in a given file, and use techniques to create a kind of digital fingerprint for it that can then be used to verify and authenticate perfect copies.
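To make the idea of a digital fingerprint a bit more concrete, here is a minimal sketch in Python. It assumes SHA-256 as the fingerprint (one common choice among several cryptographic hash functions), and the function names fingerprint and is_perfect_copy are just illustrative, not taken from any particular preservation tool.

```python
import hashlib


def fingerprint(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of every byte in the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so very large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_perfect_copy(original, copy):
    """Two files with the same fingerprint are treated as identical bit streams."""
    return fingerprint(original) == fingerprint(copy)
```

If even a single bit differs between the two files, the two digests will differ, which is what makes the fingerprint useful for verifying copies.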

With that noted, those bit streams are animated, rendered, and made usable through nested layers of platforms. In interacting with a digital object, computing devices interact with the structures of file systems, file formats and various additional layers of software, protocols and drivers. Drawing on examples from net art, video games, and born digital drafts of literary works, I explore multiple ways to approach them anchored in different layers of their digital platforms. The experience of the performance of an object on a particular screen, like playing a video game or reading a document, can itself obfuscate many of the important aspects of digital objects that are interesting and important but much less readily visible, like how the rules of a video game actually function or deleted text in a document which still exists but isn’t rendered on the screen.

As a result of this nested platform nature, the boundaries of digital objects are often completely dependent on what layer one considers to be the most significant for a given purpose. In this context, digital form and format must be understood as existing as a kind of content. Across these platform layers digital objects are always a multiplicity of things. For example, an Atari video game is a tangible object you can hold, a binary sequence of information encoded on that medium identical to all the other copies of that game, source code authored as a creative work, a packaged commodity sold and marketed to an audience, and a signifier of a particular historical moment. Each of these objects can coexist in the platform layers of a tangible object, but depending on which is significant for a particular purpose one should develop a different preservation approach.

Lastly, where the index or the codex can provide a valuable metaphor for the order and structure of a book, new media studies scholarship has suggested that the database is and should be approached as the foundational metaphor for digital media. From this perspective, there is no “first row” in a database, but instead the presentation and sorting of digital information is based on the query posed to the data. Given that libraries and archives have long based their conceptions of order on properties of books and paper, embracing this database logic will have significant implications for making digital material available for the long term.
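A small sketch can illustrate the point about there being no “first row.” The example below uses Python’s built-in sqlite3 module with a made-up table of three objects; the table name, column names, and values are all hypothetical. The same stored rows yield a different “first” item depending entirely on the query.

```python
import sqlite3

# Hypothetical columns that a query is allowed to sort by.
SORTABLE_COLUMNS = {"title", "created", "size_bytes"}


def first_title(connection, order_by):
    """Return the title that comes first under the requested ordering."""
    if order_by not in SORTABLE_COLUMNS:
        raise ValueError(f"cannot sort by {order_by!r}")
    # The column name is checked against a fixed set before being interpolated.
    row = connection.execute(
        f"SELECT title FROM objects ORDER BY {order_by} LIMIT 1"
    ).fetchone()
    return row[0]


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE objects (title TEXT, created TEXT, size_bytes INTEGER)"
)
connection.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [
        ("letter.doc", "1993-03-02", 24000),
        ("budget.xls", "1995-11-20", 91000),
        ("photo.tif", "2001-07-14", 8000),
    ],
)

for column in sorted(SORTABLE_COLUMNS):
    print(column, "->", first_title(connection, column))
```

Sorting by title, by creation date, and by size each puts a different object first, so the order a user sees is a property of the question asked rather than of the stored data.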

Chapter Three: Challenges  & Opportunities of Digital Preservation

With an understanding of digital media and some context on various lineages of preservation, it is now possible to break down what the inherent challenges, opportunities and assumptions of digital preservation are.

We can’t count on long-lived media, interfaces, or formats. Popular digital media of all kinds (disc, disk, and NAND flash wafers) degrade rather quickly, in terms of years, not decades or centuries. Many of these media are relatively complex to read, so the interfaces required to interpret them are likely to not be particularly long lived. The costs of trying to either repair these media or repair the interfaces to read them rapidly become prohibitive. As a result, traditional notions of conservation science are, outside of some niche cases, going to be effectively useless for the long-term preservation of digital objects.

Going back to the discussions of preservation lineages, this means that digital preservation is an enterprise that can only focus on the allographic digital object. While all digital information is material, the conservation of that material over the long haul is not broadly practical. Where conservation science is concerned with the chemical and material properties of mediums and artifacts, the science of digital preservation is and will be computer science. With that said, because bitstreams are always originally encoded on tangible media and then created by, acted on and interpreted by all kinds of human made layers of software they end up presenting an extensive range of seemingly artifactual and not simply informational qualities. That is, the physical and material affordances of different digital mediums will continue to shape and structure digital content long after it has been transferred and migrated to new mediums.

Section Two: Doing Digital Preservation

Chapter Four: Articulating Preservation Intent

What is it about the thing you want to preserve that matters and what do you need to do to make sure it is there in the future? To many, this seems like a simple question. It is not. Too often we take for granted that there is a de facto answer to this question. However, as a result of the nested platform nature of digital information and the fact that most of what we care about is the meaning that can be made from collections of objects, it is critical to be deliberate about how we answer this question in any given situation. This is why digital preservation must be continually grounded in the articulation of preservation intent.

In some cases, someone can clearly articulate this intent at the start of a project. But  for most preservation projects it is often best to be purposeful and strategic around the preservation intention. This is particularly critical given that deciding what matters most about some set of material can lead to radically different approaches to preserving and describing it.

Through examples of the diverse types of content that different kinds of cultural heritage organizations are preserving and their intent for doing so, this chapter establishes how to articulate preservation intent and how well-articulated preservation intent makes the resulting collections easier to evaluate and more transparent for future users.

Chapter Five: From Bit Preservation to Digital Preservation

Taking into account the challenges and opportunities of digital preservation, it is important to bracket the work into two different challenges: bit preservation and digital preservation. Bit preservation, ensuring authentic copies of digital objects, is the most pressing problem. Thankfully, it is a relatively straightforward problem for which there are a range of simple solutions. With that said, ensuring those authentic copies are interpretable, comprehensible and usable is far more challenging. Thankfully, this work of digital preservation is a much less time sensitive activity.

Bit preservation is accomplished by managing multiple copies of the digital objects you want to preserve, regularly comparing digital fingerprints for those files to ensure that they are all identical, repairing or replacing copies when they fail those checks, and migrating the copies to newer media and continuing to ensure that the digital fingerprints still match. With more resources, there are better ways to systematize and automate these processes, but with relatively small collections it is still possible to do this and be confident you have authentic copies as long as someone continues to mind and tend to them.
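The loop described above (keep several copies, compare fingerprints, replace copies that fail) can be sketched in a few lines of Python. This is a simplified illustration rather than a production tool: it assumes SHA-256 fingerprints, assumes the expected fingerprint was recorded when the object was first acquired, and uses hypothetical function names (sha256_of, audit_and_repair).

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_and_repair(copies, expected_digest):
    """Check every copy against the recorded fingerprint.

    Copies that are missing or fail the check are replaced from a copy
    that passes. Returns the list of paths that were repaired, and raises
    RuntimeError if no copy matches the recorded fingerprint.
    """
    copies = [Path(copy) for copy in copies]
    good = [c for c in copies if c.exists() and sha256_of(c) == expected_digest]
    if not good:
        raise RuntimeError("no copy matches the recorded fingerprint")

    repaired = []
    for copy in copies:
        if copy in good:
            continue
        copy.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(good[0], copy)
        # Re-check the replacement so a bad write is not silently accepted.
        if sha256_of(copy) != expected_digest:
            raise RuntimeError(f"repair of {copy} failed verification")
        repaired.append(copy)
    return repaired
```

Running something like this on a schedule, and keeping a log of what was repaired, is the kind of minding and tending the paragraph above describes. Migrating to newer media is the same operation with a new destination path added to the list of copies.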

Digital preservation is much less straightforward.  The central challenge of digital preservation is that software runs. The active and performative nature of that running is only possible through a regression of dependencies on different pieces of software that are typically tightly coupled with specific pieces of hardware. Along with this, it is important to think through if there is enough context for the digital objects for someone in the future to be able to make sense of them. Two primary strategies exist for approaching these issues: emulation and format migration. Both are discussed and a case is made for why in many cases organizations are hedging their bets and pursuing both strategies.

Chapter Six: Arranging and Describing Digital Objects

The story goes that shortly after the Library of Congress signed an agreement with Twitter to begin archiving all of the tweets, a cataloger asked “But who will catalog all those tweets?” The idea of describing billions of objects was dauntingly incomprehensible to those who lacked experience with the nature of digital media. Like most digital objects, tweets come with a massive amount of transactional metadata: timestamps, usernames, unique identifiers, links out to URLs on the web. Like most digital objects, the tweets can largely describe themselves.

The usability of digital information will be largely dependent on how we organize, arrange, and describe it.  Arranging and describing digital objects needs to conceptually shift to embrace the nature of digital media and to recognize a distinct transition which has occurred in terms of computability. Digital media continually generates massive amounts of metadata and because it is computable, it is also increasingly possible to process digital data to derive descriptive information and metadata. As a result, arranging and describing digital content should increasingly be focused on limited amounts of expert intervention in chunking and describing content in aggregate and leaving lower levels of description to the objects themselves.

In terms of arranging digital objects, their database nature means that unlike folders in a box or books on a shelf, by their very nature digital media come with a multiplicity of orders. This complicates core archival principles around original order. It also requires thinking through how to chunk content into reasonable and coherent sets of information that are easier to manipulate and work with for all kinds of current and future users.

In this context, it is critical to revisit the levels of description at which librarians, archivists, and curators work to evaluate in what cases something should be treated as an “item” or a “collection” and what levels of descriptive work should be employed. Given how much objects are self-describing, it makes much more sense to take up archival practices of describing content at the collection level and explaining the scope of a collection, the context of its acquisition, and how and why that collection was collected and preserved and to let the lower levels of description be left to the content itself.

Similarly, many digital objects actually index, describe, and annotate other digital objects. For instance, if you take all of the links that appear in articles published in the Drudge Report, the fact that the Drudge Report linked out to those sites tells you something about them. This affords the possibility of starting to think of nearly all digital objects as both data in their own right and metadata that describes other objects. To this end, we must increasingly think of “description” and “the described” as a fuzzy boundary.
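As a rough illustration of how one object’s content can become another object’s metadata, the sketch below pulls the links out of a set of HTML pages and inverts them into a record of who links to what. It uses only Python’s standard library; the class and function names (LinkCollector, inbound_link_index) and the shape of the input are assumptions made for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def inbound_link_index(pages):
    """Map each link target to the sorted list of sources that point at it.

    `pages` is a dict of source identifier -> HTML text.
    """
    index = defaultdict(set)
    for source, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for target in collector.links:
            index[target].add(source)
    return {target: sorted(sources) for target, sources in index.items()}
```

The resulting index describes the linked-to sites using nothing but the content of the pages doing the linking, which is the fuzzy boundary between description and the described in miniature.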

Chapter Seven: Divergent and Multimodal Access and Use

When a user in a research library asks to see a book in an obscure language a librarian will generally bring it out and let them look at it. That librarian may have no idea how to make sense of the text, but they know how to provide access to it and it is assumed that the researcher needs to come with the skills to make sense of it. At the most basic level, we can provide this kind of access to any digital objects we are preserving.

The affordances of digital media open up significant potential for access and use of digital content. At the same time, our experience with commercial software can get in the way of letting others access digital content until one can provide a simple way for any user to double click on a digital object and have it “just work.” It is critical for us to get over the assumptions that are embedded in this mentality and embrace the divergent and multimodal nature of access that digital media present us with.

This means digital preservation practitioners need to be OK with just saying, “Here it is, have at it” and also with consistently exploring the potential for new tools and methods for providing access to digital content. Even if you don’t know how to open a given file, there are a range of emerging techniques and approaches that researchers today and in the future will be able to use in working with digital content. In addition, it is important to think through the types of access restrictions or redaction of information that may be necessary.

This means we should be continually exploring ways to make digital content as broadly accessible and usable as individual files, bulk aggregates and a range of other modes. Researchers are increasingly interested in approaching all kinds of digital content as data sets for computational analysis and this requires adopting new ways of thinking about access.

Conclusions: The Theory & Craft of Digital Preservation

Digital preservation is not an exact science. It is a craft in which experts must reflexively deploy and refine their judgment to appraise digital content and implement strategies that make the most sense for minimizing the most pressing risks of loss while working to make it as widely usable and useful as it can be to its respective audiences. At least, that is the case I have sought to make in this book. As Stacy Eardman, digital archivist at Beloit College has noted, digital preservation is much like a lyric from the song The Have Nots, “This is the game that moves as you play.”

The craft of digital preservation is anchored in the past. It builds off of the records, files, and works of those who came before us and those who designed and set up the systems that enable the creation, transmission and rendering of their work. At the same time, the craft of digital preservation is also the work of a futurist. We must look to the past trends in the ebb and flow of the development of digital media and hedge our bets on how digital technologies of the future will play out.

My former supervisor, Martha Anderson, who worked as the Managing Director of the National Digital Information Infrastructure and Preservation Program at the Library of Congress, liked to describe digital preservation as a relay race. Digital preservation is not about a particular system, or a series of preservation actions. It is about preparing content and collections for hand offs. We cannot predict what future digital mediums and interfaces will be, or how they will work, but we can select materials from today, articulate aspects of them that matter for particular use cases, make perfect copies of them, and then work to hedge our bets on digital technology trends to try and make the next hand off go as smoothly as possible.

 

 

Student Digital Preservation Consultants Looking for Small Cultural Heritage Organizations

For many, this is where we find ourselves in organizations just starting to work on digital preservation.

I’m working on drafting up the syllabus for my digital preservation graduate seminar for the University of Maryland’s iSchool for this coming fall. I am a firm believer in learning-by-doing. I also think talking about digital preservation in the abstract, outside the very real resource and time constraints of organizations, largely misses the point. As a result, I am planning to have each student work through a series of assignments where they serve as digital preservation consultants to small cultural heritage organizations.

My hope is that this will be a meaningful learning opportunity for the students, as well as a way for them to start building out a portfolio of work that will be relevant to potential future employers. I am also optimistic that this can be a way to provide some help to small cultural heritage organizations that could benefit from having additional help to think through and develop plans for making the best use of their resources to make their digital content more long-lived.

I wanted to share a draft of the series of assignments I am putting together for two reasons:

  • First, to get feedback and input on how to improve the assignment.  I’ve posted it as a Google Doc too, so if you have suggestions for it please feel free to write comments or suggestions directly into the doc.
  • Second, pairing students with individuals who are interested in participating in this work is going to be key. I wanted to circulate this document as a means to identify people and organizations interested in working with a student as a digital preservation consultant for their organization.

Requesting a Graduate Student Digital Preservation Consultant

I think the finish line for digital preservation is a little too close to the starting line here. But it gets at the idea 🙂

If you (and your organization) would be interested in having a University of Maryland graduate student in my digital preservation seminar focus their digital preservation consultant project on your organization, please take two minutes to fill in this five-question form. I think this is a great opportunity for organizations for a few different reasons.

Here are some reasons to consider filling in the form for your organization. This project is a chance to:

  1. Solicit assistance thinking through digital preservation issues and planning for your organization.
  2. Provide a meaningful learning experience to someone just getting started in the field.
  3. Learn more about digital preservation as the student shares what they are learning through the class.

Through the course of the assignments, students will:

  1. Document and review current practices with an organization’s digital content
  2. Draft suggestions for potential next steps to improve management of digital content grounded in the resources an organization has access to
  3. Draft a digital preservation policy for consideration for the organization

On the first day of class (September 1st), I will present the organizations that have filled out the survey to my students. In the first few weeks of class I will help to pair each student with an organization for the semester.

If you are matched up with a student, the idea would be that you would commit to doing an interview or two with them about your organization’s collection and current practices for digital material and that you would review and provide input on several of their assignments (listed below).

I should underscore that it is completely fine for organizations to be literally at square one in terms of digital preservation practices and planning. So many cultural heritage organizations are just getting started with their digital preservation planning, and while it can be a bit intimidating to take some first steps in this space, there are many simple and inexpensive things organizations can be doing to mitigate risks of loss. The assignment will be most valuable for both students and organizations in cases where there is little current work being done in digital preservation. As part of this project, students will be blogging about their work, so you and your organization will need to be OK with them sharing information about the project. This can be a bit intimidating, but having students work on their public writing skills and inviting a broader audience into the discussion about how to do this work in organizations will help to ensure that the quality of that work is stronger and more useful. Through this public writing process, the results of the work will be more useful to both the student and to your organization.

What follows are details about the design of this assignment. This is also available in the Google Doc if you would like to suggest edits or make comments.

Digital Preservation Consultant Project

Here you can see a student synthesizing what they have found and drafting a plan.

An academic understanding of the issues in digital preservation is necessary but not sufficient for  professional digital preservation work. Digital preservation is fundamentally about making the best use of what are always limited resources to best support the mission of an organization. As such, to really learn how to do digital preservation you need to apply these concepts in the practical realities of an organizational context.

Aside from participating in discussion of the course readings through the course blog, the other course assignments will require you to act as a digital preservation consultant for a cultural heritage organization. For a variety of reasons I suggest this be a small institution. Below are the five assignments you must complete over the course of the semester as part of this project.

  1. Identify Small Cultural Heritage Organization and Establish Partnership (by week 3): For most of the course assignments, you will need to find a small cultural heritage organization that you can work with as a digital preservation consultant. I have identified a list of organizations that are up for participating, but you are free to find other organizations as well. The key requirements here are that 1) they have consented to working with you, 2) they have some set of digital content, and 3) their collections are not so complex that you couldn’t possibly do the project. Example institutions include an independent organization (like a house museum, a community archive, or a library) or a small department or subset of an institution (say the archives of a student newspaper or radio station, the special collections department at a public library, or the archives in a museum).
    1. Deliverable: The output of this phase is to identify this organization and confirm that you have a commitment from them to participate. We will check in on this in class as we go, but by the date of this assignment you need to have confirmed participation of an organization that meets these requirements and have posted what organization you are working on in a list on the course website. On the site, post the name of the organization, your name (or handle) and two or three sentences about the organization and its digital content.
  2. Institutional Digital Preservation Survey (Draft by week 6 and send to your org, publish with their comments incorporated by week 8): For your organization, interview one or two staff members to get a handle on their digital collections and practices. Draw from the NDSA levels of preservation as an overall framework for conducting your survey. You will want to focus on gathering information about their practices in five key areas.
    1. First, what is the scope of their digital holdings?
    2. Second, how is that digital content currently being managed?
    3. Third, what are the staff at the organization’s perceptions of the state of their digital content (are they concerned about it, do they see it as mission critical or a nice to have, what do they see as their own self-efficacy and their organization’s capacity for sustaining their content)?
    4. Fourth, what kinds of digital content would the organization like to be collecting but currently isn’t?
    5. Fifth, what resources, if any, do they have that they could bring to bear on this problem? If they have some significant potential resources, that’s great, but realize that there may well be very meaningful smaller resources that could be brought to bear. For example, could one staff member spend 2-4 hours a week on digital preservation? Could they bring in community volunteers? How much could they spend on things like extra hard drives? Throughout all of this, it will be important to understand what the organization’s collecting mission is. You want to begin to probe all the questions above, but you need to be able to map their answers to the NDSA levels.
    6. Deliverable: You will write and publish a post to the course blog (1200-3000 words) in which you present the findings of your survey. The post should first provide context: what is this organization, what are its digital holdings, and what does it want to be collecting? From there, work through presenting an accurate and coherent report of the themes and issues that came through in your interviews. At this point you are primarily interested in accurately representing the state of their work. Do not get into making recommendations. Simply do your best to succinctly and coherently explain what you found about the five areas of questioning discussed above. Before publishing this, you must present it to your org for their feedback to make sure you have their input on how you are describing the state of their work.
  3. Institutional Digital Preservation Next Steps Preservation Plan (Week 10): Now that you have the results of your survey, it is time to take out the NDSA levels of digital preservation and the rest of our course readings and figure out what a practical set of next steps would be for your organization.
    1. Deliverable: Post your next steps plan to the course blog (1200-3000 words). After a brief introduction providing context about the organization and its collections, you should work through reviewing the organization’s current work on digital content using each of the areas of the NDSA levels of digital preservation. Complete by identifying three different levels (low, medium and high resource requirement) of next steps they could take to improve their rating on the NDSA levels of digital preservation. Be creative here. For example, could they upload collection items to the Internet Archive or Wikimedia Commons? Or could they buy an extra hard drive, make copies, and swap it with a backup buddy at another organization in a different region of the country? The point here is to think about how to get them the furthest up some of the levels with the resources at hand. Before publishing this, you should present it to your organization for them to review and provide input.
  4. Draft a Digital Preservation Policy for Your Org (Week 12): Now that you have put in place a set of recommendations, it is important to also draft up a set of digital preservation policies and practices for the organization. If this is to have any impact you are going to need to be able to articulate what the organization’s policies could be going forward.
    1. Deliverable: Drawing on the example digital preservation policies we read in class, draft up a short policy document for your institution tuned to what you have learned from working with them. Draw from the examples for models for aspects of this document. Share it with them for some input and feedback. Then post it to the blog (800-1500 words).
  5. Reflecting on Lessons Learned (Week 13): After doing this work, presenting it, and getting feedback from your organization, you need to think through what worked and didn’t work for the project. Take time for reflection and tease out the lessons you’ve learned about both digital preservation and working with a cultural heritage organization.
    1. Deliverable: Return to each of the documents you created thus far and synthesize 3-5 points about what did or didn’t work or what your takeaway lessons are from this process. Think through what you will do differently the next time you help an organization improve its digital preservation practices. Bring in references to what you’ve learned from readings in the course and from what you have learned from your classmates’ work on their projects (800-1400 words).
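For the "extra hard drive" idea in assignment 3, it helps to show an organization that a copy is only useful if you can confirm it matches the original. Here is a minimal sketch of one way to do that in Python, assuming SHA-256 checksums are an acceptable fixity check for the organization. The function names `sha256_of` and `copy_and_verify` are made up for illustration and do not come from any particular preservation tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy one file (hypothetical helper) and report whether the checksums match."""
    source, destination = Path(source), Path(destination)
    # Create the destination folder if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    # copy2 also tries to carry over timestamps along with the file contents.
    shutil.copy2(source, destination)
    return sha256_of(source) == sha256_of(destination)
```

A result of `True` from `copy_and_verify` only says the two files matched at the moment of copying. Recording the checksum somewhere and re-running `sha256_of` on the backup later is what would let an organization notice if a copy has changed over time.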

All images from Digitalbevaring.dk, published under a Creative Commons Attribution 2.5 Denmark license and created by Jørgen Stamp.

Build Some Rep for Digital Preservation

A quick update on the digital preservation Stack Exchange site proposal. As I mentioned before, there are a series of ways you can help make this proposal a reality. At this point the big task is to get 100 people to commit who have more than 200 reputation on another Stack Exchange site. We already have 32 people who have achieved this, so we are about a third of the way there.

This will likely be a bit of a long haul, but considering that we have managed to get this far in only about a month I think we are well on our way.

How you get reputation:

You get reputation by asking and answering questions on any of the Stack Exchange sites. I’ve pasted in a table from their guidelines on reputation below. You will notice that you really get reputation from having your answers or your questions voted up.

This can stack up very quickly. For example, I’ve asked three questions on the Academia site and answered two, but those questions and answers were pretty good, so they got voted up multiple times and I ended up getting more than enough reputation to get over 200. You can see exactly what questions I asked and answered and what points I got for them here.

answer is voted up +10
question is voted up +5
answer is accepted +15 (+2 to acceptor)
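To get a feel for how quickly these add up, here is a minimal sketch in Python that totals reputation using only the three values in the table above. The event names and the example tally are made up for illustration. They are not my actual numbers, and the sketch ignores every other way reputation can go up or down.

```python
# Point values copied from the table above.
POINTS = {
    "answer_upvote": 10,
    "question_upvote": 5,
    "answer_accepted": 15,
}


def reputation(events):
    """Sum the reputation earned from a mapping of event name -> count."""
    return sum(POINTS[name] * count for name, count in events.items())


# Hypothetical tally: 12 answer upvotes, 14 question upvotes, 1 accepted answer.
example = {"answer_upvote": 12, "question_upvote": 14, "answer_accepted": 1}
print(reputation(example))  # 12*10 + 14*5 + 1*15 = 205
```

In this made-up tally, a couple of dozen upvotes spread across a handful of posts is enough to clear the 200 mark.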

Where to get stack exchange reputation

I built up my 200 reputation on the Academia site, but you can do it anywhere. The important thing is that you pick a site and get 200 rep on that site (you need 200 rep on a single site so getting a little bit on a bunch of different sites isn’t going to cut it.) The full list of sites can be a little bit intimidating, so I figured I would point folks to a few sites they could think about.

  • English Language and Usage Q&A for linguists, etymologists, and serious English language enthusiasts
  • Gaming Q&A for passionate videogamers on all platforms
  • Board and Card Games Q&A for people who like playing board games, designing board games or modifying the rules of existing board games
  • Travel Q&A for road warriors and seasoned travelers
  • Photography Q&A for professional, enthusiast and amateur photographers
  • Cooking Q&A for professional and amateur chefs

Take a few minutes and look over the unanswered questions on any of the sites you think you might be interested in. Take a minute to try responding to a few. Then think up some questions you might have, search to see if they are already there and if not post them. In all seriousness, you can get 200 rep on one of these sites in a very short period of time and in the process you end up getting a better understanding of how this system works.

How You Can Help Launch a Digital Preservation Q&A Site

TL;DR: Please consider clicking the commit button for the proposed site. The biggest hurdle is getting people who already participate in Stack Exchange sites to commit. Here are three ways you can help with that.

1) If you have 200+ rep on any Stack Exchange site we really need you, please commit.

2) If you don’t, consider answering, asking and commenting on any one of the 80-some Stack Exchange sites that relate to your other interests. It won’t take long to get 200 rep and you will learn about the system. After answering and asking two questions on the Academia site I had more than 200 rep.

3) Please send a link to the proposal out to others in your organization or email lists that you are on. In particular, please share this with groups of folks at your org likely to have participated in Stack Exchange sites, like software developers, system administrators, and folks in the sciences who you think might be interested.

Now for some Background on this Idea

A few of my colleagues at a range of different national and international organizations working on digital preservation have put together a proposal for a new Stack Exchange question and answer site focused on Digital Preservation. You can see the initial definition for the site below.

At different conferences, and on different projects I’m associated with, I keep hearing a lot of the same kinds of questions. I feel like there should be a place where I can point folks to a solid Q&A knowledge base. While there is an abundance of good research on digital preservation, great standards documents, and a range of different levels of solid technical guidance, there isn’t really a place to go where you can ask and find answers to the kinds of straightforward questions seen below.

Why Stack Exchange?

I’ve talked with folks about starting our own site, like what DH Answers did. However, in further discussion we thought it would be better to first try and see if we could get something started through the Stack Exchange process. Here are some reasons to do this through Stack Exchange.

  1. Built in network effects:  Many of the existing stack exchange sites, while very distinct from digital preservation, have people who overlap between them. Being on Stack Exchange means being in an integrated network of sites that others already participate in.
  2. Open Data Dumps and CC-BY Knowledge: Importantly, all the content of Stack Exchange sites is open data at several levels. We can take it, move it, share it and build from it.
  3. Not having to support technical infrastructure is nice: Stack Exchange has a dedicated staff working on refining and enhancing their platform, so the folks who want to participate can focus on the Q&A.
  4. Outreach and Big Tent Digital Preservation:  Promoting the proposal is a chance to reach out to members of other professional and technical communities to raise awareness of digital preservation. Further, if the site is launched, being part of Stack Exchange’s network would help to generate more traffic to discussions and could help lead to a broader base of digital preservation professionals.
  5. The process of proposing the site helps conceptualize it: I already think of this as a win. Just the existing prioritized list of questions that folks have is a great resource in and of itself. Even if the proposal fails and we end up needing to think about standing up our own Q&A site, the process we went through on Stack Exchange will be helpful.
  6. Getting some seasoned Stack Exchange folks involved will help digital preservationists cut their teeth on best practices for participating in Q&A sites: There is an art to composing good questions and a related art to composing good answers. Getting some seasoned Stack Exchange folks in the mix would be helpful in getting us to do this in the best and most useful way.

Background on Stack Exchange

For anyone unfamiliar with Stack Exchange, their about page is a nice quick read. I’ve copied some of their info below to give a bit of context for how they describe themselves.

Stack Exchange is a growing network of individual communities, each dedicated to serving experts in a specific field. We build libraries of high-quality questions and answers, focused on each community’s area of expertise. From programmers sharing answers on parsing HTML, to researchers seeking solutions to combinatorial problems, to photographers exposing lighting techniques, our communities are built by and for those best able to define them: the experts and enthusiasts.

Other ideas? Places to Reach Out To?

If you have some other ideas for how to make this happen I want to hear them! Are there other groups to contact? I bet there are. Share your ideas in the comments and I will follow up with them, or just take the lead and go contact some folks yourself.