Curating Science, Software and Strides in Digital Stewardship: A Personal 2013 Year in Review

It’s that time of year. Time to take stock and provide an accounting. Looking back, all the themes I noted from 2012 carried through in 2013. That kind of continuity is itself exciting; it makes me think I’ve got a career/body of work emerging from what at times can feel like a flurry of activity and projects.

What follows is a quick rundown of things I’ve been working on. This includes work from the office, from school, and from those moments stolen away on the commuter train to write and work on a range of independent projects. Looking back, I think I’ve spent a good bit of time focusing on the future of primary sources and scholarship in history, on infrastructure and strategy for digital stewardship, and on interpreting and presenting the history of science on the web.

Showing Bill Nye Carl Sagan’s Papers, a personal highlight of the year.

Future History

Orchestrating the Preserving.exe Software Preservation Summit: I’m very proud of the software preservation summit I played a role in this year. It was great to be able to take an idea from its inception about a year and a half ago through to its completion. There was great lead up to the meeting on the Signal blog, including this interview with Henry Lowood on video game preservation at scale. Discussions and presentations at the summit were well received, and I know everybody left with a lot of excitement about some of the collections being developed and the role that emulation and virtualization are likely to play in the future of access for these collections. I’m thrilled with how well the Preserving.exe report for the meeting came out.

Meditations on Digital Objects as Primary Sources: Continuing some of my work from last year, I wrote a bit about the future of significance and equivalence, about the recursive nature of items and collections, about traces, significance and preservation, and about connections between archival theory, stratigraphy and disk images. I also learned a ton doing this interview about historicizing digital preservation with perspectives from media studies and science and technology studies.

Three books with essays of mine appeared this year: Writing History in the Digital Age, Playing with the Past, and Rhetoric, Composition, Play

Digital History and the Future of Historical Scholarship: I started this year remotely offering the perspective of an early career digital historian at the annual meeting of the American Historical Association. I ended up throwing down a bit on the American Historical Association’s dissertation embargo statement and was asked to comment on the Organization of American Historians’ recent similar statement. In short, I’m becoming increasingly interested in working on the modes through which historians access and work with primary sources and the kinds of scholarly communication products they create as a result.

Closing in on the Dissertation: Earlier this year I defended my dissertation proposal. If you are at all interested in the history of the design and rhetoric of online communities, consider reading my proposal. I’m looking forward to carrying some of that thesis work forward into my job next year, further exploring preserving online communities and the vernacular web. I’m thrilled to report that I have a full draft of my thesis in hand and that it has already gone through one round of review by my thesis committee. I’m looking at defending the thesis in the early spring. I won’t be embargoing it, so you can expect to be able to download it in full from GMU’s open access dissertation repository and here on my website as soon as it’s done.

Some scratches from my notebook where I was figuring out some themes for my dissertation conclusions.

Exhibition in and of the Digital Age: Alongside the Digital Preservation 2013 meeting, I had the chance to coordinate CURATEcamp Exhibition: Exhibition in and of the Digital Age. Together with my un-conference-chairs Michael Edson from the Smithsonian Institution and Sharon Leon from the Roy Rosenzweig Center for History and New Media, I kept the plates spinning on a great and far-ranging set of discussions on the future of exhibition. There were sessions on the future of online exhibits, on visualization as a mode of exhibition, on exhibition of born digital works, and a range of other issues. You can read notes from many of the sessions up on the CURATEcamp wiki. I’m still processing and digesting some of the ideas shaken loose from the camp, so expect more from me next year on some of the ideas and implications of those discussions. Some of this percolated up in thinking through a museum’s acquisition of an historic iPhone.

From Past Player to Past Editor: This year I took on the role of co-editor of Play the Past, alongside Shawn Graham. It’s been a lot of work, and I appreciate everything Ethan Watrall did to get the blog up and running and keep it running. When I started, my primary goal was to get more activity through guest posts and getting new bloggers into the fold. I’m thrilled to have Angela Cox and David Hussey join the blog and contribute a lot of amazing work alongside a range of great guest posters. In short, I think we have seen a lot of great and diverse work on the blog and I’m looking forward to seeing where it goes in the future.

Infrastructures and Strategy for Digital Stewardship

Crowds & Roles for Public in Digital Library, Archives and Museum Projects: The year started off with the publication of a lot of my ideas on public participation in cultural heritage in Digital Cultural Heritage and the Crowd, in Curator: The Museum Journal. I interviewed Arfon Smith of Galaxy Zoo and the Adler Planetarium about the role of citizen science projects in digital stewardship and cultural heritage. I also wrote a bit about the role that citizen science projects can play in informing science education. My conversation with Mary Flanagan about her Metadata Games crowdsourcing platform ended up being one of the top Signal posts for the year. This year at THATcamp prime, a group of us thought through how crowdsourcing might be applied to explore images from inside the wealth of digitized books out there, and then actually stood up an instance of Metadata Games to run against images we stripped out of some Project Gutenberg books. I tried to spark some conversation about how cultural heritage orgs could shift their workflows to better anticipate activity of the crowd, but it didn’t really go anywhere. Yet.

Open Source and Digital Stewardship: I had a nice set of interviews on the role of open source in digital preservation and stewardship come out. I talked with Peter Murray on when OSS is the right choice for cultural heritage orgs. Tom Cramer and I discussed the approach that Hydra is taking. I talked with Don Mennerich from NYPL about his work on born digital manuscript materials and got some of Cal Lee’s perspective on the same issue in this interview on BitCurator.

Pushing Out the Levels of Digital Preservation: Earlier this year saw the publication of the first version of the NDSA levels of digital preservation and a paper on them. It’s the result of a great little sub group of folks from NDSA member organizations and I think we have a lot to be proud of in it. I’ve been thrilled to see all the ways this guidance is being used to inform practice at organizations all over the place (ex. at USGS, ARTstor, TRC Canada, MetaArchive, and Mississippi’s Archives).

Contributing to the National Agenda for Digital Stewardship: I’m thrilled to have a part in shaping the first National Agenda for Digital Stewardship. I think the document is a real triumph for the NDSA. It outlines a lot of issues that matter, and it’s unique in getting more than a hundred organizations to speak with one voice about national priorities. As the co-chair of the NDSA Infrastructure working group, I had a hand in shaping a good bit of the infrastructure section.

Special Curator for a History of Science Project

This year I’ve been thrilled to have the chance to spend the bulk of my work time on a history of science project. The work is mostly finished, but it’s not out yet so I can’t talk about it much right now. But I can talk about a few pieces of that work that are public. 

The most important thing in the universe by L.M. Glackens. Cover from Puck, v. 60, November 7, 1906.

You can get a taste of some of the work I’ve been engaged in on two of the LC blogs. I’m rather happy with this piece I wrote about visions of earth from space before we went there, which was picked up by Smithsonian magazine and by Popular Science. I also wrote about the history of imaginary space ships.

I also wrote a series of pieces on how science teachers can use some historical astronomy items as teaching tools. I’m really happy with how each of these turned out.

Not officially a part of my work, but Marjee and I pitched a script for a TED-Ed video called Is there a center of the universe? which I think turned out to be amazingly cool.

Display for the Carl Sagan Event: As part of my work I was thrilled to curate a presentation of items from the Carl Sagan papers alongside some rare astronomy books and comics and prints to illustrate how Sagan’s papers fit into both historical and fictional ideas about life on other worlds in the Library of Congress collections. A high point there for me was when I got to show Bill Nye through some of the Sagan papers.


Mass Digitization, Archives, and a Multiplicity of Orders & Arrangements

Quick, drop everything and read All Text Considered: A Perspective on Mass Digitizing and Archival Processing. It helped me think through some of what I was getting into in Implications for Digital Collections Given Historian’s Research Practices.

The abstract of the paper does a great job of explaining its objective: “coupling robust collection-level descriptions to mass digitization and optical character recognition to provide full-text search of unprocessed and backlogged modern collections, bypassing archival processing and the creation of finding aids.” The key point in the piece is that it’s becoming plausible to see digitization costs as being on par with the actual processing costs of a collection. You can read this as an even more extreme take on MPLP, where digitization would potentially replace a significant part of archival processing itself. That is exciting/intriguing for a number of reasons, one of which is as a prompt for thinking through a different kind of future for archival description and access.

The possibility of actual original order and a multiplicity of orders

Most archival original order ends up being its own kind of new order. With digitized material, if/when you do get around to doing some form of arrangement, it’s strictly intellectual arrangement; you do so without actually moving anything. That is, if you did still want to do processing you could do it on the digital files and then provide any number of different identifiers that resolve to the digital files. In essence, the information about original order and any further arrangement would be demoted from the central organizing factor to a relevant and important piece of metadata alongside any other pieces of metadata. So you have the order things came in and the order the archivist worked out after processing. One would likely do some coarse level of weeding and deaccessioning in many cases before digitizing, but once digitized a processing archivist would be able to further decide which of the scanned files should be kept and what the permissions for viewing the images are. From there, you just set different permissions: say, onsite access, reading room only access, dark archive for x years, or complete public access. You could then work from a blacklist/whitelist approach at whatever level of granularity an archive decided to process a given collection to. Not to mention, with OCRable archival material the OCR itself could be used to set up some heuristics for what kinds of materials to show to what users in what circumstances.
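
As a rough sketch of how permissions could be set at whatever level of granularity a collection has been processed to, here is one way to do it in Python. Everything in it is an assumption made up for illustration: the class names, the four restriction levels (taken loosely from the list above), and the example paths. A rule set on a box applies to everything in that box unless a more specific rule on a folder or image overrides it.

```python
from enum import IntEnum


class Restriction(IntEnum):
    # Hypothetical restriction levels, from most open to most closed.
    PUBLIC = 0
    ONSITE = 1
    READING_ROOM = 2
    DARK = 3  # no viewing context reaches this level


class Context(IntEnum):
    # Where the person asking to see an image is sitting.
    REMOTE = 0
    ONSITE = 1
    READING_ROOM = 2


def _parts(path):
    """Split '/a/b/c' into ('a', 'b', 'c') so 'box3' never matches 'box30'."""
    return tuple(p for p in path.split("/") if p)


class AccessRules:
    def __init__(self, default):
        self.default = default  # collection-wide fallback
        self.rules = {}         # path segments -> Restriction

    def set_rule(self, prefix, restriction):
        self.rules[_parts(prefix)] = restriction

    def restriction_for(self, image_path):
        parts = _parts(image_path)
        # Check the most specific prefix first (the image itself), then
        # walk up through folder, box, series and so on.
        for n in range(len(parts), 0, -1):
            rule = self.rules.get(parts[:n])
            if rule is not None:
                return rule
        return self.default

    def can_view(self, image_path, context):
        return int(context) >= int(self.restriction_for(image_path))


rules = AccessRules(default=Restriction.PUBLIC)
rules.set_rule("/mss/example/series1/box3", Restriction.READING_ROOM)
rules.set_rule("/mss/example/series1/box3/folder2/0007", Restriction.DARK)
```

The same lookup would work whether an archive stops at box-level decisions or goes all the way down to individual images.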

The container list for an archive enforces a single linear hierarchy on the contents of the archive. Each sheet of paper can only be in one folder, in one box, in one series.

Linked Open Description

If the archive just commits to minting a URL structure then this process opens an exciting new future for description. That is, if every image has a URL, and the folder and collection are named in the URL (e.g. /division/collection/series/box/folder/image), then you (or anyone else for that matter) can create a range of descriptions and relationships for those digitized objects. If something comes in substantial disorder, like the Herbert A. Philbrick Papers, many of which came in the trash cans pictured here, then you just make a directory for the trash can and number the images based on the order you pull them out of the can. When you do go ahead and arrange the scans, you can do so while retaining the order they were pulled out of the trash can as a parallel, persistent metadata element.
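
To make the trash can example concrete, here is a minimal sketch in Python. The path segments ("mss", "unprocessed", "trashcan-01", "as-pulled"), the function names and the arrangement labels are all hypothetical, invented only to show the shape of the idea: paths are minted once, in the order sheets come out of the can, and any later arrangement simply points back at them.

```python
def mint_paths(division, collection, series, container, folder, sheet_count):
    """Mint one path per scanned sheet, numbered in the order it was pulled."""
    base = f"/{division}/{collection}/{series}/{container}/{folder}"
    return [f"{base}/{n:04d}" for n in range(1, sheet_count + 1)]


# Material that arrived in disorder: the trash can stands in for the box.
as_received = mint_paths(
    "mss", "philbrick", "unprocessed", "trashcan-01", "as-pulled", 3
)

# A later intellectual arrangement. Nothing is renamed or moved; the new
# series/box/folder labels just list the already minted paths.
arrangement = {
    "correspondence/box-01/folder-01": [as_received[2], as_received[0]],
    "clippings/box-02/folder-01": [as_received[1]],
}


def pull_order(scan_path):
    """Recover the as-received position from the persistent path itself."""
    return int(scan_path.rsplit("/", 1)[1])
```

Because the minted path never changes, the order things came out of the can stays recoverable no matter how many arrangements are layered on top.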

The net result is that you are no longer limited by the fact that one atom is stuck in one spot. You just index the content in as many ways as you like. Much like the chaotic storage principles at the heart of how Amazon organizes its warehouses, you use the logic, structure and order of the database to transform the order of physical materials into something akin to the random access nature of a hard drive. The result:

  1. You get the benefit of not being limited by the fact that a thing can only be in one place at a time.
  2. You are also not limited to one linear/narrative/sequential way to find things.
  3. Anyone inside or outside an organization can then set up in-house or third party services to let stewards/curators add any level of description to any arbitrary set of images. That is, internal and external agents could provide distinct data to organize and structure collection content, which the institution could choose to harvest and display to the extent they were interested. Since you are actually minting URLs you could then start to watch inbound links to your items from things like citations and pull those links in as a kind of descriptive trackback.

If everything is digitized and each image is given an ID then any number of different modes of arrangement could be minted and maintained referencing the images, making it function much more like this distributed network. The Network by @nancywhite, CC-BY
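
One way to picture many arrangements referencing the same images is a small index keyed by arrangement name. This is only a sketch under assumed names: the class, the arrangement names ("as-received", "by-correspondent") and the image IDs are invented for illustration, and a real system would sit on a database rather than an in-memory dictionary.

```python
class ArrangementIndex:
    """Many named arrangements, all pointing at the same pool of image IDs."""

    def __init__(self):
        # arrangement name -> {label within that arrangement: [image ids]}
        self._arrangements = {}

    def add(self, arrangement, label, image_ids):
        labels = self._arrangements.setdefault(arrangement, {})
        labels.setdefault(label, []).extend(image_ids)

    def images(self, arrangement, label):
        return list(self._arrangements.get(arrangement, {}).get(label, []))

    def where_is(self, image_id):
        """Every (arrangement, label) pair that references this image."""
        return sorted(
            (name, label)
            for name, labels in self._arrangements.items()
            for label, ids in labels.items()
            if image_id in ids
        )


index = ArrangementIndex()
# The archive's own record of how things arrived.
index.add("as-received", "trashcan-01", ["img-0001", "img-0002", "img-0003"])
# A second arrangement, which could just as well come from an outside group.
index.add("by-correspondent", "unknown sender", ["img-0002"])
index.add("by-correspondent", "family", ["img-0001", "img-0003"])
```

A single image can turn up under as many labels as people care to mint, and `where_is` answers the question a container list cannot: every place a given sheet "lives."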

Paralyzing or Paralleling Workflows for Archives

I think this could also help to break up much of the serial nature of workflows for cultural heritage orgs. That is, if you digitize everything and give the images persistent URLs that mean things, then you could have any number of processes like arrangement, description, OCR, and even processes for automated description like topic modeling run against your materials in a much more parallel fashion. If we started giving persistent URLs to these images at the beginning of our workflows instead of at the end, we could reap the benefit of running any number of jobs and processes against them simultaneously. Furthermore, these could happen on a rolling basis; that is, you wouldn’t need to wait for any one process to finish before moving on to another. I wrote a bit about this idea in Paralyzing or Paralleling Workflows for THATcamp leadership, and a lot of these ideas came up and were discussed at CURATEcamp Processing: Processing Data/Processing Collections.
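
A toy version of that parallel workflow, using Python's standard `concurrent.futures` module, might look like the following. The three job functions are stand-ins that return placeholder strings (no real OCR or topic modeling happens here), and the URLs are made up; the point is only that once images have persistent URLs, independent jobs can be submitted all at once and their results collected as each one finishes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


# Stand-in jobs. Real ones would call OCR software, a cataloging tool, etc.
def fake_ocr(url):
    return f"text of {url}"


def fake_description(url):
    return f"description of {url}"


def fake_topics(url):
    return ["topic-a", "topic-b"]


JOBS = {"ocr": fake_ocr, "description": fake_description, "topics": fake_topics}


def run_jobs(urls, jobs=JOBS, max_workers=4):
    """Run every job against every URL, recording results as they arrive."""
    results = {url: {} for url in urls}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; no job waits on any other job.
        futures = {
            pool.submit(fn, url): (url, name)
            for url in urls
            for name, fn in jobs.items()
        }
        # as_completed yields each future as soon as it finishes, which is
        # the "rolling basis" part: results land in whatever order they come.
        for future in as_completed(futures):
            url, name = futures[future]
            results[url][name] = future.result()
    return results


urls = ["/mss/example/series1/box1/folder1/0001",
        "/mss/example/series1/box1/folder1/0002"]
results = run_jobs(urls)
```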

All Kinds of Cans of Worms Opened

All Text Considered: A Perspective on Mass Digitizing and Archival Processing opens all kinds of different cans of worms. For some kinds of materials, the prospect of digitization and OCR could make material accessible in shorter order. With that said, it throws open the doors to figuring out what exactly intellectual control means in those circumstances, what kind of further processing and arrangement one would want to do, and how to go about integrating automated techniques for summarizing and describing content that an archivist might use to complement and extend their efforts to make an archive’s structure legible to their users.

I’d love to hear your reactions to some of my provocations here, and any other thoughts and reflections the essay prompts, in the comments.

Thanks to Jefferson Bailey, Thomas Padilla, and Ed Summers for comments on a draft of this post. They each had some great ideas and input. I hope they’ll bring some of their more extended comments into the comments here.

6 Digital Historiography and Strategy Grad Seminars I’d Love to Teach

As I’ve been working on finishing my dissertation over the last two years I haven’t had the chance to teach graduate seminars and I really miss it. I’ve twice taught American University’s History in the Digital Age course for their History and Public History program and I’d love to do that sort of thing again. Partially inspired by other very cool courses I see folks sharing syllabi from, and as a fun thought experiment, here are ideas for six grad seminars I’d love to develop and teach.


Visualizations of the Enron Email Archive Dataset

Understanding and Interpreting Born Digital Primary Sources: Web archives, software collections, video games, digital photographs, email archives, historical laptops, floppy disks; the world (and institutions of cultural memory) are now flush with born digital primary sources. Working directly with digital artifacts students would explore and develop practices and processes for making sense of born digital materials.

Public Digital History: Scholarly Communication, Explication and Participation on the Web: Historians and public historians write books and articles and develop exhibitions to communicate to audiences about the past. The web brings with it a range of modes for communication and dialog and significant opportunities for historians to engage with and invite participation from the people formerly known as the audience.


A photo of the Einstein Memorial shared on Flickr

Sites of Memory: Museums, Monuments and Memory in the Digital Age: What do you make of the TripAdvisor page for the Albert Einstein Memorial? Of all the selfies people take in museums? What does the potential for augmented reality mean for the set up and presentation of historic homes? The course explores what changes as public sites of memory become part of networked publics.

Historicizing the Digital in Digital Preservation: It’s easy to fall into the trap of thinking that digital objects are a stable and straightforward thing. In practice, electronic records, software, and digital objects have meant different things at different points in the history of computing. This would basically be a take on Allison, Brian and Jefferson’s course.

Studying the Vernacular Web: Making Sense of Records of Everyday Life from the Web: Folklorists, anthropologists, sociologists and other adherents to ethnographic research methods have developed approaches for netnography and virtual ethnography to study the ways that people are creating and developing cultures on the web. The course would focus in particular on the methodological questions inherent to studying the records of computer mediated communication.

Digital Strategy for Cultural Heritage Organizations: Digital is increasingly becoming a key part of nearly every function of cultural heritage organizations (Libraries, Archives, Museums etc.). We are increasingly acquiring, preserving and exhibiting born-digital and digitized materials, using social media for outreach and public relations, supporting researchers and fielding reference questions through digital channels, and supporting all of that work with a substantive IT infrastructure. Looking across each of these areas, this course would focus on exploring ideas for how organizations should be structured, the role software development should play, embedding “digital into the design, decision making, strategy and all the operations” of cultural heritage orgs, and the role that the web should play as a platform and organizing principle for orgs.

So, if anyone from a D.C. metro area institution of higher learning wants someone to teach an awesome special topics course in the evenings after work drop me a line. Oh and please feel free to run with any of these as ideas for your own courses. There is no higher flattery than having



Historic iPhones: Personal Digital Media Devices in the Collection

What should a library, archive or museum do with an historic iPhone? The National Museum of American History recently acquired journalist Andy Carvin’s iPhone. The announcement about the acquisition piqued my curiosity and prompted a set of questions. I imagine this is something we will be seeing a lot of. The iPhones and BlackBerries of politicians, journalists, digital artists & activists are increasingly the tools of their trades.

So, what should cultural heritage organizations do when presented with the chance to acquire rather locked down personal media devices like this? What follows are a few of my initial strands of thought about it and a set of related questions I’d be interested in hearing from others about. What is it about these physical and digital objects that is significant and needs to be attended to?

My first thought regarding the acquisition of Andy Carvin’s iPhone: are they going to preserve the contents of the device, or is the idea just to hold on to the physical artifact? That’s more or less what I asked the museum. (Erin Blasco from NMAH and I chatted a bit about this over Twitter.) As I suspected, the idea is basically to just hold on to the physical artifact.

So. What exactly is it that they have? Yes, it is his phone.  Those are scratches on it that he made, and it has his stickers on it. You can put that physical artifact on the shelf and pull it out to examine it. But if you were to ask me what my iPhone is I would mean the stuff inside it. The stuff on it. That is what my phone is.

What is your iPhone?

Is my phone the cracked one in the picture or the one I took the picture with?

I have bad luck with iPhones. I’ve twice shattered the screen of my phone. If you’ve ever swapped out one phone for another you’ve likely had the same slightly surreal experience I’ve had. You back up your phone in iTunes. You plug in the new phone and restore it from the backup. You pop out the SIM card from the old phone and stick it into the new one. Then you power up your new phone.

At that moment, you sorta have two identical digital phones. All your apps are there, all your settings come over, the wallpaper. Last time I changed out my phone I took a picture of the old cracked phone with the new one. I’d moved the ghost in the machine over from one shell to another. I guess more accurately, I’d made a full identical copy. Part of the whole idea of the iPhone as an artifact is that the physical device is supposed to disappear in user experience. It’s got almost no buttons, and the entire UI emerges through software. You’re supposed to feel that it’s the interface, the pictures under the glass, that is the real device.

So what does that have to do with Andy Carvin’s iPhone? Well, I’d imagine he still has his phone, and that what NMAH received is sorta like the cast off phone I had there in the box. He migrated his device forward and what remains is more of a time capsule. A historical moment of Andy Carvin’s iPhone. Just like I can go power up that cracked phone in the box on my shelf and see what my phone was like at 7:38 PM on 22 Aug 2013, if you turn on Andy’s phone in the collection (assuming he didn’t delete everything on it before giving it) you would be able to see a moment in time of his phone. Exactly what it was like right before he transferred its contents to another device.

The iPhone’s NAND memory

In any event, as far as I’m concerned, a device like an iPhone is first and foremost a digital object. It’s the data on the NAND memory in there brought to life by the software in it that is what the phone is. Which leads to a bit of consideration of the digital object of the iPhone.

The Digital Object of an iPhone

Where someone can make a disk image and emulate Salman Rushdie’s old laptops, the contents of Andy Carvin’s iPhone are more elusive. If you have a power supply, you’ll likely be able to power this thing up and see what’s on it, now and into the future. But getting things off of the device itself would be more of a challenge. You could (for the time being) boot up a computer and read it like a drive to, say, get copies of all the photos and videos off of it. Or, if you had the skill set, you could go ahead and get into mobile device forensics and actually capture a full disk image of the device.

The Tweets he made from the phone aren’t in there

Much of the content viewed on iPhones, and similar devices, is pulled in over the network. So if you aren’t connected, or when those services eventually turn off, you won’t have access to that content.

One of the points of this artifact, what matters about it, is what Carvin did on Twitter: his use of Twitter as a medium for reporting. While he used this particular phone to send out those tweets, the device itself does not have copies of those tweets in it. If you booted it up and opened the Twitter application on it there is a good chance that you could read his tweets, and the tweets of those folks he follows. However, you would be reading those via the device logging into Twitter and downloading that content. So if you were interested in collecting his tweets, you would actually want to go out and ask him to download a copy of his Twitter archive and send it over to you.

The other Smithsonian iPhone

As a point of comparison, there is at least one other iPhone in the collections of the Smithsonian Institution. Writing about the acquisition of an iPhone app, Seb Chan from the Cooper Hewitt Design Museum wrote about the iPhone they have in the collection and the inherent limitations in thinking about how to make use of that device.

The iPhone in our collection is neither powered on nor has it been kept up to date with newer software releases. Eventually the hardware itself might be considered so delicate that to power it on at all would damage it beyond repair—a curse common to many electronic objects in science and technology collections. How then do we preserve the richness and novelty of the software interfaces that were developed and contributed equally if not more than the industrial design to that device’s success?

Some open discussion questions:

This is all me just thinking out loud here. Or I guess, thinking out in bits. I’d love to hear thoughts and comments from folks on what this acquisition prompts. In particular, on any of the following four questions.

  • What should archives and museums presented with iPhones be doing with them?
  • How would you even go about attending to the digital object of the iPhone? I’d be curious to hear some ideas for how one might go about ingesting, preserving and eventually providing access to the digital contents of the physical device.
  • Do you know of any other examples of acquisitions of personal media devices like this? If so, I’d love to hear about the who, what, where, why of that.
  • What analogies can we draw between different kinds of artifacts museums collect and Carvin’s phone? If the guts of it die and you can’t power it up, is it like a folder that once contained a set of notes? If you can power it up, is it like a fly trapped in amber that we can study as it was preserved in a particular moment in time? Since it doesn’t have a copy of the tweets in it, is it like the red phone from the White House, which would have been used to make particular calls but has practically no trace of the content of those calls in it? What other connections or parallels might you draw?

Google Poems on History

I thought it would be fun to see what Google poems come out of history, libraries and archives. So here you go. Curious to hear if these mean anything to you.

History is…

But history isn’t…

The past is…

But the past is not…

Historians are…

Memory is…

Archives are…


Notes toward a Bizarro World AHA Dissertation Open Access Statement

Bizarro World AHA would be totally into Open Access

The American Historical Association published a Statement on Policies Regarding the Embargoing of Completed History PhD Dissertations. I found myself wishing that there was some kind of bizarro world AHA. I imagine this bizarro world AHA might have made remarks based on these bullet points. These are just a rough draft. I encourage others to refine and further develop them.

  1. Assert that the scholarly society’s goals are for the proliferation of knowledge not the proliferation of a particular kind of media (like monographs) or a particular business model (like selling academic monographs, primarily to university libraries).
  2. Thank doctoral students who have made their dissertations accessible to anyone for supporting the value of sharing their research.
  3. Note that dissertations are fundamentally different than the books a university press might edit, develop and revise based on them. Beyond that, assert that open access to dissertations in no way competes with books that are developed from dissertations.
  4. Explain that the scholarly society would speak out against publishers who decided to blackball scholars who had made their dissertations publicly accessible through their universities’ repositories.
  5. Suggest that it is fundamentally problematic that the tenure and promotion of historians is based directly on the commercial viability of academic books. Where scholars in other disciplines often control the primary means of tenure (journal articles), in fields like history that rely on book publication those decisions are (in large part) made by academic presses.
  6. Call for members of the association to explore, and encourage the development of new models for the review and evaluation of a wide range of historical work, particularly those that make scholarship as widely accessible as possible.
  7. Note that it is a fundamental problem that career development for historians in the academy is focused on the production of books that are read by few people and encourage the community of historians to refocus their energy on how they can produce historical work that people will read and can have an impact on society.




It’s Items All the Way Down

Recursion, ▓▒░ TORLEY ░▒▓, CC-BY-SA

Often folks in the world of Libraries, Archives and Museums are asked to give an account of how many items they have. At which point the person asked must come up with a way of taking account. It’s good to take account. Decisions about what constitutes an item and what constitutes a collection or a series of items are often presented as if they were a simple matter of fact when they are actually based on a set of decisions in a given context.

The point being, items don’t exist as much as they are made by applying a set of judgement calls to things that exist. Item-ness isn’t innate as much as it is the result of a process of making the world legible.

Items are made when we make judgement calls about the relative importance of these features:

  1. Physical distinct-ness (The item represents a physical whole, a discrete physical or digital object)
  2. Authorship/Creatorship (The item represents an intellectual whole, it comes from a particular author or creator or process)
  3. (Are there other things you would add?)

Items and their item parts: 

  • A book is an item, it is also a collection of pages which are themselves items, it might contain chapters, potentially written by different authors, each of those is an item, each of the figures printed in the book is an item.
  • An archive is an item, each folder in an archive is an item, each of the individual letters in that folder is an item, the five letters that someone stapled together that are in that folder are an item too.
  • A videotape is an item, each of the individual recordings on that tape is an item.
  • A web archive is an item, each URL in the web archive is an item, each file in the web archive is an item, each directory is a kind of item.
  • A newspaper is an item, the articles in the newspaper are items, a year’s worth of newspapers bound together is an item.
  • 24 hours of radio or television broadcast is an item, all of the individual shows are also items, each commercial is an item, each distinct block of air time with its individual commercials is an item.
  • A computer is an item, each of its hard drives is an item, the directories on the hard drive are items, the files in those directories are items, the sectors of the disk are items.
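
For the programmers in the room, here is a rough sketch of the same idea in Python. It is just an illustration, not a real cataloging model: the `Item` class, the `count_items` function and the toy computer are all names and data I made up for the example. The point it tries to show is that "how many items do you have?" has no answer until you decide how far down you are willing to count.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """A hypothetical item: a label plus whatever items we decide it contains."""
    label: str
    parts: List["Item"] = field(default_factory=list)


def count_items(item: Item, max_depth: Optional[int] = None, depth: int = 0) -> int:
    """Count this item plus its nested items, stopping at max_depth.

    max_depth=None means count items all the way down.
    """
    if max_depth is not None and depth >= max_depth:
        return 1  # judgement call: treat everything below here as one item
    return 1 + sum(count_items(part, max_depth, depth + 1) for part in item.parts)


# A made-up computer with one hard drive, two directories and three files.
computer = Item("computer", [
    Item("hard drive", [
        Item("directory", [Item("file a"), Item("file b")]),
        Item("directory", [Item("file c")]),
    ]),
])

print(count_items(computer, max_depth=0))  # the computer alone
print(count_items(computer, max_depth=2))  # computer, drive and directories
print(count_items(computer))               # all the way down
```

The same object yields a different count at each depth, and none of the counts is more true than the others. Each just reflects a different decision about where to stop.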



Signifying & Significance: Figuring out what matters and saving the digital things that testify to that mattering

Last week I was excited to participate as a panelist in a small conference at the Bard Graduate Center called Digital/Pedagogy/Material/Archives. The goal of the event was to bring together scholars working at the intersection of these four terms to think about how best to grapple with the challenge of archiving new forms of digital scholarship coming out of the classroom.

I’ve posted my slides on slideshare (and embedded them below) but in the course of our discussion I thought it would be helpful to share some links and brief points to round out my perspective both for those who participated in the conference and anyone who comes across my slides on the web.

Part of the goal of the meeting was to provide provocations on these terms. So what follows, and what is in my slides, is intended to be a bit provocative.

ARIS & Menokin Student Project: 

I decided it would be best to share a particular case study of a student project as an example to think through what exactly someone might want to save in a digital object. I shared a place-based game that Laura Heiman & Caitlin Miller (students in my History and the Digital Age course) created. The course is a research methods course for doctoral students at American University and an elective for students in their Public History Master’s program. I picked these two blog posts to share for context on the project.

For background on my approach to teaching this course, and for why I have these two blog posts to point to, check out this post I wrote a while back.

In my talk I tried to push up against some of the default assumptions that we bring to talking about preserving digital objects and artifacts. I tried to articulate a perspective on preservation that is grounded in first identifying what material objects (digital or analog) would be able to testify, or be the place where evidence exists, of the information you think is interesting. For background and context on where some of these remarks come from, here are some pieces I’ve written and drawn from.

  • Significance is in the eye of the stakeholder: What’s important about this piece is the notion that objects don’t have significant properties. Instead, there are different properties of objects that are significant to different potential users/audiences/stakeholders.

Objects have an infinite number of properties, and those properties include traces of the past that people can interpret as evidence for claims. With that said, to take a preservation action is to decide to act to ensure access to properties that one finds particularly important or significant.

  • The is of the digital object and the is of the Artifact: This is mostly a riff on the materiality of digital objects. What is important in this context is that there are a lot of different material artifacts that have traces of the things that we care about. I mention Ian Bogost’s example of the 11 different things that the video game E.T. the Extra-Terrestrial is to underscore that even defining what that game is requires an extensive set of decisions about what someone might want to do with it. Now if you want to preserve that video game you need to think about which of those 11 objects you want to have evidence about.

So, if you want to preserve E.T. the Extra-Terrestrial, or anything for that matter, it’s critical to realize that drawing boundaries around what the object itself is requires you to make decisions about the significance of particular properties for particular potential uses.

  • Glitching Files for Understanding: With all this said, digital objects are material objects and as such they have very real properties. Once you have identified an artifact that you care about, you need to attend to the facts or features of its existence. In this post, I try to break apart many of the assumptions of screen essentialism, the idea that what digital objects look like on the screen is their essence. In contrast, digital objects are in fact bits of encoded information.

If we understand and respect the formal and forensic materiality of these objects in the context of the properties of them that we want to preserve then we are well on our way to making this work.

In this case, if you want to know about the Albert Einstein Memorial, the physical object of the statue is not where you find the evidence you need. This is not a particularly interesting point, but what is interesting is that the archival record of the memorial’s creation in the National Academy of Sciences archives is far less interesting as a source of information about what the memorial means to people than the very ephemeral information about the memorial in reviews of it on Yelp and TripAdvisor and the pictures people share of it on Flickr. This is to say that there is a wake of artifacts that exist in a network of meaning around this particular memorial and that many of the most interesting artifacts to get at what the memorial means to people are not the thing itself (the memorial) or the records about the thing itself (the documents in the archives) but are instead things people are saying about it (posts on Yelp & TripAdvisor) or doing with it (using it as a prop to take photographs). So what you think matters about the object will push you to try and save different kinds of objects that can provide potential evidence on that subject.

  • Archives in Context and as Context: During our discussion we slid around a good bit in what we meant by archiving, archives, and preservation. So it’s worth linking out to Kate’s great piece on this subject as it provides good grist for pinning down what we mean at different moments about archives.

The main point here is that one needs to be careful in clarifying what one means with the word archive. Kate focuses on a few points, but one of the trickiest that comes up is the use of “archiving” as a verb or saying something “has been archived.” Archives are places, and similarly preservation is something that institutions do, not something that is accomplished. Nothing is preserved, there are only things that are being preserved. Nothing is archived, there are only things that are in archives.

If there is one thing we can count on it’s entropy. All material objects in the world are wearing down and degrading. Everything in the world eventually succumbs to its own inherent vices. At the end of the day the question is what traces of the world recorded on artifacts we want to commit ourselves to ensuring long-term access to.


Designing Online Communities: Read My Accepted Dissertation Proposal

Wisdom of the Ancients: the web-comic-epigraph for my dissertation proposal, from XKCD

As of last Monday, I have successfully defended my dissertation proposal. In the context of my doctoral program, that means there is just one more hurdle to climb over to finish. I’m generally rather excited about the project, and would be thrilled to have more input and feedback on it (Designing Online Communities Proposal PDF). I would be happy for any and all feedback in the comments of this post.

Designing Online Communities: How Designers, Developers, Community Managers, and Software Structure Discourse and Knowledge Production on the Web

Abstract: Discussion on the web is mediated through layers of software and protocols. As scholars increasingly turn to study communication, learning and knowledge production on the web, it is essential to look below the surface of interaction and consider how site administrators, programmers and designers create interfaces and enable functionality. The managers, administrators and designers of online communities can turn to more than 20 years of technical books for guidance on how to design and structure online communities toward particular objectives. Through analysis of this “how-to” literature, this dissertation intends to offer a point of entry into the discourse of design and configuration that plays an integral role in structuring how learning and knowledge are produced online. The project engages with and interprets “how-to” literature to help study software in a way that respects the tension that exists between the structural affordances of software and the dynamic and social nature of software as a component in social interaction.

What’s Next? 

At some point in the next year I will likely defend a completed dissertation. Places do dissertations differently; in my program the idea is that what I just defended is actually the first three chapters of a five-chapter dissertation. So, at this point I need to follow through on what I said I would do in my methods section (to create chapter 4, results) and then write up how it connects with the conceptual context section (to create chapter 5, conclusions). So I should be able to grind this out in relatively short order.

At this point, I think this project should be interesting enough to warrant a book proposal. So I’ll likely start exploring putting together a book proposal for it in the next year as well. With that in mind, any suggestions for who might be interested in receiving a proposal on this topic are welcome.


Small Pieces Loosely Kludged: Peer Review and Publication in Math Scholarly Communication

I’m always interested to hear about how different scholarly communities are changing their communications practices. Things like PLOS ONE, and projects like PressForward, are putting forward interesting and new models for when and where review happens and how we establish credibility and mark for quality. At the recent ScienceOnline conference I had the pleasure of chatting a bit with David Zureick-Brown, a mathematician and one of the founders of MathOverflow. Given how forward thinking much of the math community has been in this scholarly communication space I was thrilled to have a chance to pick his brain about similarities and differences between fields.

High Rates of Rejection in Math Journals: It works differently there

I was initially taken aback by something David said. It was something like “In my field, if you aren’t getting at least a 50% rejection rate on papers you submit to journals you aren’t aiming high enough.” The idea being that you should try to get your work into more prestigious journals, and many of these journals have two-year backlogs. In one situation, a paper that had largely positive reviews was rejected because it wasn’t important/exciting enough. This is exactly the thing that projects like PLOS ONE are set up to get around: they try to stop evaluating papers for quality and instead do a minimal evaluation of whether they pass a minimum bar.

Publication happens before Publication

Initially I thought this sounded terrible! You submit your papers, wait for rejections, and then shift down a bit. Wouldn’t this hold up getting your work out there? But then I remembered that math is different. At this point there is an expectation that you put all your work up on arXiv as soon as it is coherent enough to be called a paper. So this review process wasn’t holding up the publication process. As soon as work is done it’s published. People start reading it on arXiv. When I realized this I suggested “Oh, so publication in a journal is actually really just like a mark of quality, it’s like a merit badge.” Now, it’s a really important merit badge in the field, as the quality of the journals you are published in is a key factor for tenure and promotion. So getting a piece published in a particularly prestigious journal is effectively a seal of quality/approval that a given work matters and is significant for the field.

Small Pieces Loosely Kludged

This kludged-together system seems like a great outcome. I can’t imagine anyone set out to make this system work this way. Anything can get published on arXiv, at which point anyone can see the work, cite the work, and reference it. The journals are now really just serving as amplifiers. The peer review of this work is actually post-publication peer review. In this system it sort of doesn’t matter if journals want to become open access. If they let you put up pre-prints you’re good to go. The content of the journals is already published and open access. It only costs folks money to see the papers if they want to see the fancy PDFs.

It’s largely about when you call it a publication

So whether review counts as post-publication or pre-publication depends largely on what we call the publication. Humanities and Social Science folks can just start to put all their stuff up in places like, or up on SSRN before submitting it. In many social sciences at this point this is a standard practice. While I’m a big fan of institutional repositories, I find the situations where the field-specific platforms have emerged a bit more exciting. In these cases, the expectations and behaviors of scholars have shifted. It’s the norm to expect that you can see your colleagues’ work online in these spaces as soon as it has come together.

So why doesn’t this happen in History and the Humanities?

The fact that arXiv, SSRN, RePEc and a few other disciplinary networks emerged for sharing scholarship in draft form, and that nothing like them has taken off in the humanities, is an indictment of the humanities. How come mathematicians, astronomers, economists and scholars in a range of other fields could just set up places to share their work and humanists haven’t? As you can see from the math situation, if a scholarly community just shifts to sharing pre-prints and everybody does it then it basically doesn’t matter what publishers want to do in terms of open access. This is to say that scholars have no one but themselves and their peers to point to if they don’t like how scholarly communication works. As the math case shows, we can patch our scholarly communication system one kludge at a time and end up with a system that embraces broad open access and rapid dissemination and retains merit badges for quality.

