Digital Preservation’s Place in the Future of the Digital Humanities

The following are the rough notes for a talk I gave at the University of Pittsburgh’s iSchool. I’ll likely come back later to iron out any kinks in them, but figured I would get them up sooner rather than later, so here they are. Thanks to Alison Langmead for the invitation. You can review all the slides here.

Ensuring long term access to digital information sounds like a technical problem; like it could be a problem for computer scientists to solve. If we could only set up the right system we could “just solve it”. Far from it.

Digital Preservation is not primarily a technical problem

I’ve become increasingly convinced that digital preservation is in fact a core problem and issue at the heart of the future of the digital humanities.

In this talk, I will suggest how some issues and themes from the history of technology, new media studies, and archival theory, gesture toward the critical role that humanities scholars and practitioners should play in framing and shaping the collection, organization, description, and modes of access to the historically contingent digital material records of contemporary society. That’s a mouthful. In short, I think there is a critical need for a dialog and conversation between work in the digital humanities and work building the collections of sources they are going to draw from.

This is a broad topic, and I am trying to pull a lot of different strands from different fields together here. So this is going to be less a comprehensive argument and more of a survey, glancing off a range of projects and ideas that point toward the important interconnections that already exist between the digital humanities and digital preservation.

What is a Digital Historian Doing with Digital Preservation

When I tell people I am a historian and I work on digital preservation I get a lot of confused looks. What on earth is a digital historian and what does it have to do with digital preservation? I’m not entirely sure what being a digital historian entails, but as far as Google Image Search is concerned, I’m part of the definition. (It’s my picture there in the green.)

What Google Image Search thinks a digital historian looks like. I’m on the grass.

But back to the point: when I mention that I do digital history and I work on digital preservation, I’m often asked questions like “Isn’t that IT? Isn’t that technical? Is that like computer science? Or library science or something?” Initially I was a bit timid in responding to these queries. I was still finding my way through a highly technical field myself. I’d assert that understanding the born digital records of our society is in fact very important to historians. But I’ve been becoming bolder in this regard.

Trying not to Define the Digital Humanities

Yes, digital preservation is a technical field, one that requires technical skills. However, being a good art historian of modern German art also requires extensive technical skills in, say, German. An understanding of digital artifacts should be a central part of the emergent digital humanities.

What Google Image Search’s Hive Mind thinks the Digital Humanities is/are.

This brings us to the second part of the title. What does digital preservation have to do with the emergent field of digital humanities? The digital humanities are different things to different people and I don’t want to spend too much time trying to define it/them. Again, in Google Image Search’s hive mind the digital humanities have something to do with word clouds, projects, debates and logos.

Working Definitions of the Digital Humanities

In any event, I see three primary areas of activity in DH.

  1. Computational Analytic Methods: Here I’m thinking about computational approaches to studying primary sources (think here of Google’s n-gram viewer, of corpus analysis, of various and sundry ways of using computers to count things and conduct distant reading),
  2. Experimentations in the Format of Scholarship: Here I’m thinking about work on the future of digital scholarly communication and publication (new kinds of journals, digital scholarship projects like Ed Ayers’s Valley of the Shadow, various kinds of online exhibitions and presentations of primary sources using platforms like Omeka),
  3. Interpreting the digital record: interpreting born digital primary sources. This last area is essential to the future of the first two.

If the digital humanities is ever to study the 21st century, that study is going to be based on born digital primary sources. We need forms of digital hermeneutics, the reflexive process of interpretation at the heart of humanities scholarship, that fit with digital texts and artifacts.

Selection and Definition: Points of Contact Between Humanists and Preservers

Importantly, there are two primary issues on which humanists have a lot to offer in shaping the digital historical record: selection and definition.

  1. Selection: What is collected and preserved
  2. Definition: What features of digital objects are significant to preserve

Selection/Collection:

We can’t count on benign neglect as a process of waiting to figure out what might matter in the future. The lifespan of most consumer grade digital media is much, much shorter than that of analog media. Further, when digital media fail, the failure is often complete, as opposed to leaving the content partially recoverable. To that end, there is a need for many to follow in the footsteps of projects like the Center for History and New Media’s September 11th Digital Archive, where a group of historians intervened and launched a site to crowdsource the collection of everything from text messages and emails to other digital traces of the attacks for future historians to make sense of. Learning lessons from areas like oral history collection, it is essential for historians to wade in and actively work to ensure that the digital ephemera of society will be available to historians of the future.

The point about selection is important, but it’s largely contiguous with current practices. Decisions about selection for collections are always fraught and contingent on the values and perspective of the collecting institution. Far more problematic is the fact that the very essence of what a digital object is is itself contentious and dependent on the kinds of questions one is interested in.

What is Pitfall? It depends on what your research questions are.

For instance, what is Pitfall? Is it the binary source code, is it the assembly code written on the wafer inside the cartridge, is it the cartridge and the packaging, is it what the game looks like on the screen? Any screen? Or is it what the game looked like on a cathode ray tube screen? What about an arcade cabinet that plays the game? The answer is that these are all Pitfall. However, for different people (individual scholars, patrons, users, etc.) what Pitfall is is different. If humanists want to have the right kind of thing around to work from, they need to be involved in pinning down what features of different types of objects matter for what circumstances.

This point is expansive, so I’ll briefly gloss it before going into depth on each of these topics. In keeping with much of the discourse of computing in contemporary society, there is a push toward technological solutionism that seeks to “solve” a problem like digital preservation. I suggest that there isn’t a single problem so much as there are myriad local problems contingent on what different communities value. With that said, this is not a situation of “anything goes.” Digital media are material and based on inscription. This set of insights from new media studies offers a new basis for us to develop an approach to source analysis and criticism of the kind that has a long standing history in fields like textual scholarship.

One of the biggest problems in digital preservation is a persistent belief by many that the problem at hand is technical, or that digital preservation is a problem that can be solved. This is a form of solutionism, a term I’m borrowing from Evgeny Morozov, who himself borrowed it from architecture. Design theorist Michael Dobbins explains, “Solutionism presumes rather than investigates the problem it is trying to solve, reaching for the answer before the questions have been fully asked.” Stated otherwise, digital preservation, ensuring long term access to digital information, is not so much a straightforward problem of keeping digital stuff around, but a complex and multifaceted problem about what matters about all this digital stuff in different current and future contexts.

The technological solutionism of computing in contemporary society can easily seduce and delude us into thinking that there could be some kind of “preserve button”. Or that we could right click on the folder of American Culture on the metaphorical desktop of the world and click “Preserve as…” In fact, as noted in the case of Pitfall! defining what it is that one wants to keep around is itself a vexing issue. In digital preservation this problem is often smuggled into the notion of “significant properties.”

 

Chimerical Significance

The problem that is all too often swept away in technical discussions of preservation is what is to be preserved. That is, in established practices for digital preservation, like web archiving, attempting to preserve rendered content is the assumed solution. Just grab the HTML and files displayed when an HTTP request is made and then play them back in a tool like the Wayback Machine. With that noted, it’s critical to realize that making sense of and interpreting, performing if you will, that content is itself a complex dance involving differing ideas about authenticity.

In the case of a web page, is it its source code, or what it looks like rendered? Is it what it looks like rendered on the particular version of the particular browser it was composed to be viewed on? Is it what it looks like when it runs on a computer with a particular vintage of internal memory clock that produces part of how visual elements flicker? If you are only interested in the textual record of the site, then the text is all you need. But if you are a conservator of net art and this happens to be an important work, you may need to spend considerable time doing ticky tacky work to ensure that the work retains its fidelity to its creator’s intent.
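To make the "text is all you need" case a little more concrete, here is a minimal sketch in Python of what a text-only reading of a page keeps and what it throws away. The page source is a made-up example in the spirit of a Geocities page, and the class name TextOnly is my own; nothing here is how any particular web archive actually works.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep the visible text of a page and note every tag that gets discarded."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.dropped_tags = []

    def handle_starttag(self, tag, attrs):
        # Background colors, images and blinking all live in tags and attributes.
        self.dropped_tags.append(tag)

    def handle_data(self, data):
        if data.strip():
            self.words.append(data.strip())


# Hypothetical page source, invented for illustration.
page = ('<body bgcolor="black"><img src="header.gif">'
        '<blink>Welcome</blink> to my page</body>')

reader = TextOnly()
reader.feed(page)
reader.close()
text = " ".join(reader.words)
print(text)
print(reader.dropped_tags)
```

The textual record survives intact, but the header image, the black background and the blinking are exactly the properties a conservator of net art would care about, and they are the first things to go.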

To make this a bit more concrete, we can turn to a small corner of a now extinct neighborhood in Geocities. For those unfamiliar, Geocities was an early online community which Yahoo! turned off in 2009. Due largely to the work of ArchiveTeam, a self described group of rogue archivists, much of Geocities was collected and distributed. Looking at a small sliver of that archive can underscore some of the issues at the heart of the problem of preserving and accessing this kind of material.

Geocities page viewed through the Internet Archive’s Wayback Machine

Same Geocities site as presented in One Terabyte of Kilobyte Age.

Here are two images of archived copies of a spot in the Capitol Hill neighborhood of Geocities. The first is what it looks like rendered in my browser at work. The second is what it looks like as presented in One Terabyte of Kilobyte Age, created by Olia Lialina & Dragan Espenschied. One Terabyte of Kilobyte Age is in effect a designed reenactment of Geocities grounded in an articulated approach to accessibility and authenticity, which plays out in an ongoing stream of posts to a Tumblr account. Back to the two images: note that the header image is missing in the first one, as displayed in my modern browser. The image is still there, but my browser isn’t doing a good job at creating a high fidelity presentation of what the site should look like.

The point is, that you can’t just “preserve it” because the essence of what matters about “it” is something that is contextually dependent on the way of being and seeing in the world that you have decided to privilege. In the case of something like Geocities, it turns out that there are a bunch of different decisions one can make about fidelity and authenticity and different collections are taking different approaches.

Dragan’s take on the trade offs inherent in different approaches to authenticity and accessibility for preserving webpages.

Dragan’s vision for the presentation is anchored in this continuum of authenticity and accessibility across the entire stack of technologies at play in the presentation of a web page. That is, One Terabyte of Kilobyte Age is a kind of critical edition (a mainstay as a scholarly product) of Geocities. Unlike those behind many other web archiving projects, Dragan is very upfront about what it is that he has decided to privilege and focus on in this special collection or critical edition of Geocities. The resource he has created here is both an interpretation and a point of access into some of the most significant properties of Geocities that might otherwise be lost.

In short, deciding what it is that one wants to keep is vexing and problematic. With that said, it is critical to note that we do actually have something to hang on to here. There is in fact a there there when it comes to digital objects. Further, the work of humanities scholars to understand the fundamental forensic and textual traces of digital objects points the way forward to a hermeneutics, an interpretive approach to understanding and studying digital primary sources. The most essential work in this area is Matthew Kirschenbaum’s Mechanisms: New Media and the Forensic Imagination.

Materiality & Inscription

We all know that digital media is binary, that somewhere there are screens of ones and zeros doing something like in the Matrix.

The ones and zeros at the heart of digital media are in fact texts. Inscribed at the limits of augmented human perception, the sequences of bits on a hard drive are still very much material. Inscribed in the sectors of a disk are files in formats intended to be read and interpreted by different pieces of software, software which is itself inscribed on different pieces of storage media. The point here is that the longstanding traditions of studying texts, of interpreting them, have a home at the basic root level of digital objects, which are both sequences of textual information and material culture visible in magnetic flux transitions on disk or the pits on optical media.

The structures of this media share an affinity with a strand of archival theory too.

Media and Data Structures as Fonds

Whatever your feelings about the imperative of the archivist to respect des fonds, the injunction to maintain original order and to pay attention to the provenance of materials, it remains a cornerstone of the identity and professional practice of archives. Attempting to maintain the original order in which materials were managed before being accessioned and making decisions when processing an archive with respect to the whole both suggest a kind of archeological or paleontological understanding of documents, records and objects. An object’s meaning is always to be understood in the context of the objects near it and the structure it is organized in.

In the analog world, it’s often difficult to infer what that order is. For instance, the Herbert A. Philbrick papers came to the Library of Congress in a mixture of boxes and trash cans.

Contrast that with the order of a floppy disk from playwright Jonathan Larson’s papers. Irrespective of his own strategies for organizing his data, and his .trashes, the computer saves and stores information like the time he last opened the files. (For more on this example, see the work of Doug Reside, Digital Curator for the Performing Arts at the New York Public Library.)

The logic of digital media, of data structures, is one of order. Even if a user tries to eschew that order, the machine insists on creating, storing and retaining all manner of technical metadata and time stamps.
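If you want to see the machine insisting on this order, here is a minimal sketch in Python that writes a throwaway file and then asks the file system what it recorded about it. The file name draft.txt and the function name technical_metadata are just illustrative, and exactly which time stamps a file system keeps (and how reliably it updates access times) varies from system to system.

```python
import os
import tempfile
from datetime import datetime, timezone


def technical_metadata(path: str) -> dict:
    """Return a few of the facts the file system keeps about a file."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


with tempfile.TemporaryDirectory() as folder:
    draft = os.path.join(folder, "draft.txt")  # hypothetical file name
    with open(draft, "w", encoding="utf-8") as handle:
        handle.write("Act One")
    record = technical_metadata(draft)

print(record)
```

Nobody asked for those time stamps. They accrue as a side effect of saving the file, which is what makes them such useful evidence later.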

The order of bits on a disk, the structure of files in a file system, and the organization and structure of data available from an API are each fonds-like. Data and records accrue according to the process and logic of digital media. Just as the structure and organization of records and knowledge in the analog world say as much about the materials as what is inside them, so the same is true in the digital. The layers of sediment in which something is found enable you to understand its relationship to other things. Context is itself a text to be read.

With this noted, other humanities scholars have clarified that all too often we privilege one mode of reading that underlying data structure. Our knee jerk reaction is that what is significant about a digital object is what it looks like or does on the screen.

Screen Essentialism

Digital objects are encoded information. They are bits encoded on some sort of medium. We use various kinds of software to interact with and understand those bits. In the simplest terms software reads those bits and renders them. However, the default application for opening a file isn’t the only way to go about it. You can get a sense of how different software reads different objects by changing their file extensions and opening them with the wrong application.

For example, if you just change the file extension of an .mp3 to .txt and then open the file up in your text editor of choice, you can see what happens when your computer attempts to read an audio file as a text.

While this is a big mess, notice that you can read some text in there. Notice where it says “ID3” at the top, and where you can see some text about the object and information about the collection. What you are reading is embedded metadata, a bit of text that is written into the file. The text editor can make sense of those particular arrangements of information as text.
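You can approximate what the text editor is doing with a few lines of Python. This is only a sketch: the bytes below are a tiny synthetic stand-in that I laid out to resemble the start of an ID3v2-tagged .mp3 (a tag header, one title frame, then a few audio-ish bytes), not a real recording, and printable_runs is a name I made up.

```python
import re


def printable_runs(data: bytes, min_len: int = 3) -> list[str]:
    """Find the stretches of bytes that happen to be readable ASCII text."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Synthetic sample bytes, invented for illustration.
sample = (
    b"ID3\x03\x00\x00\x00\x00\x00\x15"     # tag header: "ID3", version, flags, size
    b"TIT2\x00\x00\x00\x0b\x00\x00\x00"    # title frame header plus encoding byte
    b"Field tape"                          # the embedded title text
    b"\xff\xfb\x90\x00\x00\x01\x02\x03"    # stand-in for the audio data itself
)

runs = printable_runs(sample)
print(runs)
```

Everything the function returns is metadata sitting in among the audio. The rest of the bytes are still there, they just don't perform as text.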

Here are an .mp3 and a .wav file of the same original recording, each changed to a .raw file and opened in Photoshop. Look at the difference between the .mp3 on the left and the .wav on the right. What I like about this comparison is that you can see the massive difference between the sizes of the files visualized in how they are read as images. Notice how much smaller the black and white squares are. It’s also neat to see a visual representation of the different structure of these two kinds of files. You get a feel for the patterns in their data.
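You don't need Photoshop to try this kind of reading. Below is a rough Python sketch that treats each byte as one grayscale pixel and wraps the result in a plain PGM image header. The function name and the fixed width are my own choices, and this is only an approximation of the idea, not a description of what Photoshop's raw import actually does.

```python
def bytes_as_pgm(data: bytes, width: int) -> bytes:
    """Read raw bytes as a grayscale image, one byte per pixel (binary PGM)."""
    height = len(data) // width           # drop any incomplete final row
    pixels = data[: width * height]
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + pixels


# Stand-in data; in practice you would read the bytes of an audio file instead.
data = bytes(range(256)) * 4
image = bytes_as_pgm(data, width=64)
print(image[:13])
```

At a fixed width, a smaller file simply produces a shorter image, which is one way to see the size difference between a compressed and an uncompressed recording.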

These different readings or performances of a file aren’t particularly revelatory, except to underscore that the very act of opening a file, of seeing its contents, is a process of interpreting a text. The sequence of 1’s and 0’s is enacted in front of us by software. Formats and software are themselves essential actants in this performance, which other humanities scholars have done great work to help us understand.

Format and Medium in Platform Study

In a detailed study of the Atari 2600, Nick Montfort and Ian Bogost suggest that the study of software inevitably involves the study of layers of software on top of software intertwined with particular pieces of hardware. For example, the tiny amounts of RAM in the 2600 resulted in a complicated problem for programmers to display graphics. They extensively discuss the game Pitfall, so we can return again to its example.

Illustration from Montfort and Bogost’s Racing the Beam

This illustration shows what the game screen looks like from inside the system. Note that what we see on the screen, the area with the fellow swinging there, is really just a small portion of how the game thinks of its screen. The three large areas (vertical blank, horizontal blank, and overscan) are actually where the computations necessary for keeping score and working through the game are done. In this case, being able to understand how a game like Pitfall was innovative is intimately connected to being able to actually understand the relationship between the game’s functionality and the underlying constraints of the Atari platform. For those interested in preservation, it further complicates the idea of collecting and preserving such an artifact, as a more nuanced understanding of the platform continues to reveal important, seemingly hidden, characteristics of its nature.
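A bit of arithmetic gives a feel for how much of the frame is hidden. The numbers in this Python sketch are the commonly cited NTSC timing figures for the 2600 (228 color clocks per scanline, 262 scanlines per frame); they are my assumption here and are not taken from the illustration itself.

```python
# Commonly cited NTSC figures for the Atari 2600, assumed for this sketch.
COLOR_CLOCKS_PER_LINE = 228   # 68 of horizontal blank plus 160 of picture
HORIZONTAL_BLANK = 68
SCANLINES = {"vsync": 3, "vertical_blank": 37, "visible": 192, "overscan": 30}

total_lines = sum(SCANLINES.values())
frame_clocks = COLOR_CLOCKS_PER_LINE * total_lines
visible_clocks = (COLOR_CLOCKS_PER_LINE - HORIZONTAL_BLANK) * SCANLINES["visible"]
hidden_share = 1 - visible_clocks / frame_clocks

print(f"{hidden_share:.0%} of each frame is never drawn as picture")
```

By these figures roughly half of every frame goes by without any picture being drawn, and that hidden time is where the game logic has to fit.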

Going forward, Bogost and Montfort’s notion of “platform studies” should become increasingly important to those working to preserve digital artifacts.

From their perspective, the layers in these platforms provide particular affordances and constraints but are generally taken for granted by users as a part of the platform. In this sense, a platform could be anything from a piece of hardware, like the 2600, to a programming language like C++, Java, or Python, a format like MP3 or .gif, a set of protocols like HTTP and the DNS, or something like Adobe Flash that provides a language and runtime environment for works.

I’ll quote Montfort and Bogost’s explanation of platforms here at length as it is particularly pertinent.

By choosing a platform, new media creators simplify development and delivery in many ways. Their work is supported and constrained by what this platform can do. Sometimes the influence is obvious: A monochrome platform can’t display color, a video game console without a keyboard can’t accept typed input. But there are more subtle ways that platforms interact with creative production, due to the idioms of programming that a language supports or due to transistor-level decisions made in video and audio hardware. In addition to allowing certain developments and precluding others, platforms also encourage and discourage different sorts of expressive new media work. In drawing raster graphics, the difference between setting up one scan line at a time, having video RAM with support for tiles and sprites, or having a native 3D model can end up being much more important than resolution or color depth.

The point is as follows: the nested nature of platforms, their ties in and out of software and hardware and culture, is the essential problem of digital preservation and a key question for anyone interested in long term access to our digital records to grapple with. Our world increasingly runs on software and hardware platforms. From operating streetlights and financial markets, to producing music and film, to conducting research and scholarship in the sciences and the humanities, software platforms shape and structure our lives. Software platforms are simultaneously a baseline infrastructure and a mode of creative expression. They are both the key to accessing and making sense of digital objects and increasingly important historical artifacts in their own right. When historians write the social, political, economic and cultural history of the 21st century they will need to consult the platforms of our times. As underscored already, even defining the boundaries of such works is itself a fraught and interpretive project. For this reason alone I firmly believe that digital preservation is a primary challenge which should pique the interest of digital humanists.

To recap, in work on the materiality of digital objects, in conceptions like screen essentialism, humanists are already providing critical information for those interested in collecting and preserving the digital record.

Examples like Dragan’s work with Geocities illustrate how there is considerable value in closer collaboration here, where scholars actually dig in and create special collections or critical editions of digital records to clarify the perspective taken in their collection.

Aside from this, I think there is one other key reason that digital primary sources should cry out for the attention of digital humanists.

The Born Digital Record is Already Computable 

When I opened my talk, I noted that to many, the digital humanities is synonymous with computational approaches to studying texts. Importantly, coming around from the other side of this, from the consideration of digital primary sources for digital preservation, we end up with far, far, far more computable data than the digitized corpora of historical texts that many of those interested in doing computational research in the humanities are working from.

Where historical works must be digitized, born digital media is by definition already computable. That is, when we gather together aggregations of data, be they web archives, aggregates of selfies from Instagram, or corpora of files from software packages, they are already computable.

In a talk about working with web archives, Historian Ian Milligan stated the problem concisely.

If history is to continue as the leading discipline in understanding the social and cultural past, decisive movement towards the digital is necessary. Every day most people generate born-digital information that if held in a traditional archive would form a sea of boxes, folders, and unstructured data. We need to be ready.

In short, the future of the computational humanities is itself going to be turning to the increasingly heterogeneous digital fonds, data sets, data dumps, corpora of software and images and logs of transactional data.

The Praxis of Digital Preservation

Dialog with each of these areas of work in the humanities is essential to the future of digital preservation.

What we need is a generation of conservators, archivists, and historians with extensive technical chops who realize just how contingent and complex deciding what bits to keep and how to go about keeping them is.

Digital objects, artifacts, texts, and data are something more than “content”; they are the material anchors, the primary sources, through which we can interpret, critique, and understand our society.

I firmly believe that ours should be a golden age for born-digital special collections, archives, troves and critical editions. The future of digital preservation is less about defining a hegemonic set of best practices, than it is about scholars, curators, conservators and archivists working together to define what it is that they value about some kind of digital content and to then go out and collect it and make it available for use to their constituencies. It is about setting definitions that are often at odds with each other but that are coherent toward their own ends.

 

A Draft Style Guide for Digital Collection Hypertexts

The cover of A Signal from Mars: March and Two Step shows the rather civilized Martians relaying a piece of music to earthlings with the use of a spotlight. As featured in Messages to and From Outerspace. A Signal from Mars, 1901. Music Division, The Library of Congress.

I spent about 60% of my work hours last year selecting a thematic collection of 330 cultural heritage objects and interpreting and explicating facets of those objects in a set of 18 linked essays. I had a style guide for questions of grammar, and the HTML structure of the layouts was rather straightforward. However, I realized rather quickly that if I was going to do this consistently I should put together my own set of guidelines for the actual structure, function and style I would use for approaching this writing project. Nothing about this is formal or official or anything like that. These are just my own personal notes, thoughts and reflections that informed how I approached framing the work.

What follows is the short list of guidelines/rules for composing online exhibition-ish narrative pages for the web which I developed for my own use. Given some recent great discussion of what the ideal for history on the web should be, I figured I would share the rules I set for myself as they might be of use to others working in this form. Ultimately, in the collection objectives section I decided to call it a “hypertext,” which ideally expresses

The Chimera of the Digital Collection Hypertext

An online only interpretive presentation of representations of cultural heritage objects is something of a chimeric creature. It’s the sort of online collection/interpretive material that all kinds of folks develop when they use platforms like Omeka—ticky-tacky interpretive analytical writing and explication alongside a massive pile of related historical primary sources for users to go out and explore on their own.

  • Part Exhibition: Its purpose is similar to that of a physical museum exhibit, except that the constraints and benefits of physical space are absent. For example, an online exhibition can sprawl out forever, but you lose out on the quality of “being there” in the presence of the artifacts.
  • Part Illustrated Publication: As text and images on a web page, they are also like those “illustrated history” books, where one works through a linear narrative but can stop off to read detailed information about an image. In this case, the similarity falls off in that hypertext provides a much more networked and connective potential structure for an online text. Furthermore, while people do skim books, web reading is fundamentally different.
  • Part Expansive Collection of Sources: Where you only have the space to show an image on part of a page in a book, and there is a limit to what you can display in the physical space of an exhibit, on the web you can provide links out to every page in a draft or the whole audio recording.
  • All Hypertext: Ultimately, I think the most precise term for what these things are is hypertext. A term that sadly fell out of vogue with cyberspace a while back, but a term I think is worth going back to as HTTP is itself the defining logic and form of the web.

A Ready-to-hand Draft Style Guide

I had some web writing information to work with, but I ended up working up my own style guide-ish set of rules to work from for putting together these pieces. What follows is my rundown of rules (most of which I didn’t break much). The intention of this set of guidelines was to try and take the ideas of exhibition and print publications that make extensive use of deep captions and figure out how they fit into the way web writing works and people engage with the web. I feel like these served me well, and figured others might be interested in them. I’d similarly be interested in comments/discussion of these.

  1. Every narrative page stands on its own: The web is not a physical space and you have no control over what page someone will see first. The result of this fact is that a well conceived online exhibition narrative page needs to stand on its own. That means it needs to have a compelling title that includes key terms in the page, and that the text of a page cannot assume that a reader has read any other text in the exhibition. Every page is effectively the first page/front door for some set of potential users. It’s critical that the page stand on its own and invite users for further exploration at every turn.
  2. Every caption should explicate/interpret the image/object presented. Images, audio and moving image content need to be captioned in such a way that the captions explicate and interpret the items. It is not enough to simply say what something is; the caption should scaffold a visitor into seeing what is important about the artifact in this context. Ideally, the way the object is presented/cropped/edited suggests part of this, that is, it helps to actually show and not just tell. Part of the purpose of presenting these objects is to demonstrate reading and interpreting them. As such, they should not be extraneous. For example, if one wants to include a portrait of an individual one should not simply say it is a portrait of them. It’s necessary to suggest points in the work to read, like the way they are drawn or items they are holding and how those communicate something about how that individual is being represented in this case.
  3. Object captions should always stand on their own: The captions for objects presented should also stand on their own. Web readers skim and make use of images as a form of visual headings. As such the captions for those images should make enough sense on their own that visitors can use them as a different index to the content of the page.
  4. A new heading should break up text after every few paragraphs: Again, Web writing is different from print writing in that web readers are far more likely to skim content. Good and frequent use of headings makes it easy to skim text and further hook readers to dig into the narrative content. Think more Associated Press style and less Chicago Manual of Style.
  5. An image from an item should always be visible as one scrolls through the page: The goal is showcasing the objects, so there should always be items from the collection visible on the screen at any given moment. This focuses attention on the items while also making the page easier to explore and read. Note: This is a particularly vexing thing to deal with in responsive design for mobile devices. I’d be curious for ideas about how this point should change in a mobile situation.
  6. Each page should be in the long blog post sweet spot (700-2000 words): This length makes pages substantive enough to tell an interesting story and make a few important points but keeps them from being so long that they are difficult to briefly explore. If a piece is getting significantly longer than this it could likely be broken into smaller individual pieces, which would have the benefit of creating another page that serves as its own point of entry into the exhibition.
  7. Hyperlink text for connections and emphasis: Every two paragraphs should have at least one hyperlink connecting to an important concept in another section of the exhibit. The links underscore what matters in a given paragraph and make it easy for visitors to chart their own path through the exhibition. This is the primary power of hypertext as a medium. Think of how rich a Wikipedia entry is with links. The goal of this, and many of these guidelines, is to create a fertile network of connections that lets someone get lost in the content much like people do with Wikipedia. Ideally, item pages will record essays that link to them too, making each item itself into a potential point of entry to the presentation.
  8. Links should consistently connect out across subsections: Each page in the exhibit should ideally include at least one hyperlink to a page in a completely different section. Silos are bad, and history is not a straightforward progression of events. If you think different thematic sections of an exhibition are coherent enough to hang together, there should be connections between individual pieces as you go.
  9. Show parts of items, link out to whole items: Unlike a physical exhibition you are not limited by the size of a frame, showing one page in a book, or putting a video on loop and hoping that people will stick around for it to come back again. Good exhibition narrative pages direct a visitor’s attention to features of items that are particularly interesting in a given context, but ideally that user is just a click away from looking at the whole of a work, or seeing things next to a given letter in a particular folder. There will be cases where this is impossible, either because digitizing is a strain on resources or for rights reasons. With that noted, the ideal is to put up as whole a copy as possible of any primary sources that can be integrated in their own right, and not simply to crop photos to illustrate the narrative.

 What do you think?

Are there things you would add, refine, or take off the list? Do you have any suggestions for other kinds of guidance that are worth integrating with this sort of thing? What thoughts do you have about how this sort of thing would change given different potential audiences? In short, I’m curious to hear what you think of all of this.

Posted in Uncategorized | 7 Comments

Redefining the “Life of the Mind” & the Infrastructure of Knowledge in the Digital Humanities Center

If you haven’t read Bethany Nowviskie’s recent post responding to the question “Does every research library need a digital humanities center?”, go do so. It’s really good. DH+Lib put out a call for further discussion of and responses to the issues Bethany raised, so I thought I would post a few quick comments here. This is something more than a tweet, but not necessarily as fully formed as some of my other blog posts.

Research Libraries as Infrastructure for Humanities Scholarship
To me, what is really exciting about the digital humanities is that a lot of the work in the field is actually about redefining what the products and process of scholarship should be. It’s not just about doing things and writing books and articles about them, it’s also about figuring out how everything from blogs, to web applications, to mobile apps, data sets, and a range of tools can themselves be scholarly products.

It’s a bit of a caricature and a gloss over a lot of the hybrid roles that libraries have played in scholarship, but I think the following is a functional definition of how many think about research libraries’ relationship to humanities scholars.

  1. Scholars use libraries as an access point to “the literature” (books and journal articles).
  2. Scholars then publish their work, adding to the literature.
  3. Then libraries collect that new work and the cycle repeats.

Again, there are a lot of awesome other things that research libraries do, but I’d suggest that this is the primary mode through which they are thought of: as an instrument for access to knowledge. In this bifurcation, the scholars live the life of the mind and make scholarship, and the research library is the infrastructure that enables them to do so.

Redefining Products and Process of the Life of the Mind

The digital humanities centers I’m most excited about are an amazing kind of scholarly middle ground; places where scholars from different research traditions work alongside librarians, archivists, software engineers, system administrators, usability and human computer interaction experts and project managers to invent a new kind of knowledge infrastructure.

What is critical here is that the products of scholarship, the book and the article, are being called into question. The DH center as humanities skunk-works has significant implications for the idea of who serves whom and of what scholarship itself is, and it holds the potential for a significant reinvention of the roles of a range of information professionals in the work, labor, and life of the mind in research and scholarship.

Digital Humanities Centers Without Scholars

To illustrate just how independent this kind of activity can be from service to scholars, I’d suggest that one of the most successful centers of DH activity isn’t built to serve scholars as much as it’s built to serve the public. New York Public Library’s lab, NYPL Labs, is a powerful example of what the possibilities are for the digital humanities in research libraries, in part because it’s not a service-to-researchers model at all. I imagine many wouldn’t classify NYPL Labs as a DH center at all, likely because it doesn’t have this kind of relationship with scholars. I’d argue that the fact that they consistently win grants from the Office of Digital Humanities is the best evidence that they are a DH center. If you look across their work you see the work of engaged and thoughtful creative professionals working on reinventing the infrastructure of knowledge and scholarship. That impulse in the digital humanities has considerable value to contribute to the core mission of research libraries.

Posted in Uncategorized | Leave a comment

Read my dissertation if you like: Designing Online Communities

I defended my dissertation today. If you’re at all interested you can read the draft I defended here. The event brings to an end about 23 years of continuous education. (I’ve been working full time for the last seven of those, but nonetheless, going to school for the last 23 years.) While it was accepted as is, I am still going to be doing some format tweaking and copyediting as it goes through its process to get its final signatures. Ultimately that final version will go into GMU’s digital repository. With that said, several folks were interested in reading the draft as it is now, so I figured I would share it here.

accepted as is

The Ideology, Rhetoric and Logic of Online Community Over Time

The diagram below is, by and large, the crux of the argument I ended up developing in the dissertation. For the most part, ideas of online community shift toward a communitarian set of language focused on electronic democracy in the early Web. That utopian vision is further and further undercut as it turns into a discourse of permission and control. The features of early online discussion systems harden into platforms like phpBB and vBulletin and ultimately pave the way for elaborate reputation systems in social networks. It’s a lot more complicated than that, so read the dissertation if that sounds interesting.

Crux of my dissertation

 

Title: Designing Online Communities: How Designers, Developers, Community Managers, and Software Structure Discourse and Knowledge Production on the Web

Abstract: Discussion on the Web is mediated through layers of software and protocols. As scholars increasingly study communication and learning on the web it is essential to consider how site administrators, programmers, and designers create interfaces and enable functionality. The managers, administrators, and designers of online communities can turn to more than 20 years of technical books for guidance on how to design online communities toward particular objectives. Through analysis of this “how-to” literature, this dissertation explores the discourse of design and configuration that partially structures online communities and later social networks. Tracking the history of notions of community in these books suggests the emergence of a logic of permission and control. Online community defies many conventional notions of community. Participants are increasingly treated as “users”, or even as commodities themselves to be used. Through consideration of the particular tactics of these administrators, this study suggests how researchers should approach the study and analysis of the records of online communities.

Dissertation Defense

 

Posted in Uncategorized | 3 Comments

Curating Science, Software and Strides in Digital Stewardship: A Personal 2013 Year in Review

It’s that time of year. Time to take stock and provide an accounting. Looking back, all the themes I noted from 2012 carried through in 2013. That kind of continuity is itself exciting; it makes me think I’ve got a career/body of work emerging from what at times can feel like a flurry of activity and projects.

What follows is a quick rundown of things I’ve been working on. This includes work from the office, from school, and from those moments stolen away on the commuter train to write and work on a range of independent projects. In looking back I think I’ve spent a good bit of time focusing on the future of primary sources and scholarship in history, on infrastructure and strategy for digital stewardship, and on interpreting and presenting the history of science on the web.

Showing Bill Nye Carl Sagan's Papers, a personal highlight of the year.


Future History

Orchestrating the Preserving.exe Software Preservation Summit: I’m very proud of the software preservation summit I played a role in this year. It was great to be able to take an idea from its inception about a year and a half ago through to its completion. There was great lead up to the meeting on the Signal blog, including this interview with Henry Lowood on video game preservation at scale. Discussions and presentations at the summit were well received, and I know everybody left with a lot of excitement about some of the collections being developed and the role that emulation and virtualization are likely to play in the future of access for these collections. I’m thrilled with how well the Preserving.exe report for the meeting came out.

Meditations on Digital Objects as Primary Sources: Continuing some of my work from last year, I wrote a bit about the future of significance and equivalence, about the recursive nature of items and collections, about traces, significance and preservation, about connections between archival theory, stratigraphy and disk images,  and learned a ton doing this interview about historicizing digital preservation with perspectives from media studies and science and technology studies.

Three books with essays of mine appeared this year: Writing History in the Digital Age, Playing with the Past, and Rhetoric, Composition, Play


Digital History and the Future of Historical Scholarship: I started this year remotely offering the perspective of an early career digital historian at the annual meeting of the American Historical Association. I ended up throwing down a bit on the American Historical Association’s dissertation embargo statement and was asked to comment on the Organization of American Historians’ recent similar statement. In short, I’m becoming increasingly interested in working on the modes through which historians access and work with primary sources and the kinds of scholarly communication products they create as a result.

Closing in on the Dissertation: Earlier this year I defended my dissertation proposal. If you are at all interested in the history of the design and rhetoric of online communities consider reading my proposal. I’m looking forward to carrying some of that thesis work forward into some of my job next year further exploring preserving online communities and the vernacular web. I’m thrilled to report that I have a full draft of my thesis in hand and that it has already gone through one round of review by my thesis committee. I’m looking at defending the thesis in the early spring. I won’t be embargoing it, so you can expect to be able to download it in full from GMU’s open access dissertation repository and here on my website as soon as it’s done.

Some scratches from my notebook where I was figuring out some themes for my dissertation conclusions.


Exhibition in and of the Digital Age: Alongside the Digital Preservation 2013 meeting, I had the chance to coordinate CURATEcamp Exhibition: Exhibition in and of the Digital Age. Together with my un-conference-chairs Michael Edson from the Smithsonian Institution and Sharon Leon from the Roy Rosenzweig Center for History and New Media I kept the plates spinning on a great and far ranging set of discussions on the future of exhibition. There were sessions on the future of online exhibits, on visualization as a mode of exhibition, on exhibition of born digital works, and a range of other issues. You can read notes from many of the sessions up on the CURATEcamp wiki. I’m still processing and digesting some of the ideas shaken loose from the camp, so expect more from me next year on some of the ideas and implications of those discussions. Some of this percolated up in thinking through a museum’s acquisition of an historic iPhone.

From Past Player to Past Editor: This year I took on the role of co-editor of Play the Past, alongside Shawn Graham. It’s been a lot of work, and I appreciate everything Ethan Watrall did to get the blog up and running and keep it running. When I started, my primary goal was to get more activity through guest posts and getting new bloggers into the fold. I’m thrilled to have Angela Cox and David Hussey join the blog and contribute a lot of amazing work alongside a range of great guest posters. In short, I think we have seen a lot of great and diverse work on the blog and I’m looking forward to seeing where it goes in the future.

Infrastructures and Strategy for Digital Stewardship

Crowds & Roles for Public in Digital Library, Archives and Museum Projects: The year started off with the publication of a lot of my ideas on public participation in cultural heritage in Digital Cultural Heritage and the Crowd, in Curator: The Museum Journal. I interviewed Arfon Smith of Galaxy Zoo and the Adler Planetarium about the role of citizen science projects in digital stewardship and cultural heritage. I also wrote a bit about the role that citizen science projects can play in informing science education. My conversation with Mary Flanagan about her Metadata Games crowdsourcing platform ended up being one of the top Signal posts for the year. This year at THATcamp prime, a group of us thought through how crowdsourcing might be applied to explore images from inside the wealth of digitized books out there, and then actually stood up an instance of Metadata Games to run against images we stripped out of some Project Gutenberg books. I tried to spark some conversation about how cultural heritage orgs could shift their workflows to better anticipate activity of the crowd but it didn’t really go anywhere. Yet.

Open Source and Digital Stewardship: I had a nice set of interviews on the role of open source in digital preservation and stewardship come out. I talked with Peter Murray on when OSS is the right choice for cultural heritage orgs. Tom Cramer and I discussed the approach that Hydra is taking. I talked with Don Mennerich from NYPL about his work on born digital manuscript materials and got some of Cal Lee’s perspective on the same issue in this interview on BitCurator.

Pushing Out the Levels of Digital Preservation: Earlier this year saw the publication of the first version of the NDSA levels of digital preservation and a paper on them. It’s the result of a great little sub group of folks from NDSA member organizations and I think we have a lot to be proud of in it. I’ve been thrilled to see all the ways this guidance is being used to inform practice at organizations all over the place (ex. at USGS, ARTstor, TRC Canada, MetaArchive, and Mississippi’s Archives).

Contributing to the National Agenda for Digital Stewardship: I’m thrilled to have a part in shaping the first National Agenda for Digital Stewardship. I think the document is a real triumph for the NDSA; it outlines a lot of issues that matter and it’s unique in getting more than a hundred organizations to speak with one voice about national priorities. As the co-chair of the NDSA Infrastructure working group, I had a hand in shaping a good bit of the infrastructure section.

Special Curator for a History of Science Project

This year I’ve been thrilled to have the chance to spend the bulk of my work time on a history of science project. The work is mostly finished, but it’s not out yet so I can’t talk about it much right now. But I can talk about a few pieces of that work that are public. 

The most important thing in the universe by L.M. Glackens. Cover from Puck, v. 60, November 7, 1906.

You can get a taste of some of the work I’ve been engaged in on two of the LC blogs. I’m rather happy with this piece I wrote about visions of earth from space before we went there, which was picked up by Smithsonian magazine and by Popular Science. I also wrote about the history of imaginary space ships.

I also wrote a series of pieces on how science teachers can use some historical astronomy items as teaching tools. I’m really happy with how each of these turned out.

Not officially a part of my work, but Marjee and I pitched a script for a TED-Ed video called Is there a center of the universe? which I think turned out to be amazingly cool.

Center of universe ted video

Display for the Carl Sagan Event: As part of my work I was thrilled to curate a presentation of items from the Carl Sagan papers alongside some rare astronomy books and comics and prints to illustrate how Sagan’s papers fit into both historical and fictional ideas about life on other worlds in the Library of Congress collections. A high point there for me was when I got to show Bill Nye through some of the Sagan papers.

 

Posted in Uncategorized | Leave a comment

Mass Digitization, Archives, and a Multiplicity of Orders & Arrangements

Quick, drop everything and read All Text Considered: A Perspective on Mass Digitizing and Archival Processing. It helped me think through some of what I was getting into in Implications for Digital Collections Given Historian’s Research Practices.

The abstract of the paper does a great job at explaining its objective, “coupling robust collection-level descriptions to mass digitization and optical character recognition to provide full-text search of unprocessed and backlogged modern collections, bypassing archival processing and the creation of finding aids.” The key point in the piece is that it’s becoming plausible to see digitization costs as being on par with the actual processing costs of a collection. You can read this as an even more extreme take on MPLP, where digitization would potentially replace a significant part of the processing process itself. That is exciting and intriguing for a number of reasons, one of which is as a prompt for thinking through a different kind of future for archival description and access.

The possibility of actual original order and a multiplicity of orders

Most of archival original order ends up being its own kind of new order. So if and when you do get around to doing some form of arrangement, it’s strictly intellectual arrangement: you do so without actually moving anything. That is, if you did still want to do processing you could do it on the digital files and then provide any number of different identifiers that resolve to the digital files. In essence, the information about original order and any further arrangement would be demoted from the central organizing factor to a relevant and important piece of metadata alongside any other pieces of metadata. So you have the order things came in and the order the archivist worked out after processing. One would likely do some coarse level of weeding and deaccessioning in many cases before digitizing, but once the material is digitized a processing archivist would be able to further decide which of the scanned files should be kept and what the permissions for viewing the images are. From there, you just set different permissions: say, onsite access, reading room only access, dark archive for x years, or complete public access. You could then just work from a blacklist/whitelist approach at whatever level of granularity an archive decided to process a given collection to. Not to mention, with OCRable archival material the OCR itself could be used to set up some heuristics for what kinds of materials to show to what users in what circumstances.
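To make the permissions idea a bit more concrete, here is a minimal sketch of how a collection-wide default plus overrides at any level of granularity might be resolved. Everything in it is an assumption for illustration: the level names, the `level_for` and `can_view` helpers, and the sample paths are hypothetical and not drawn from any real system.

```python
# Hypothetical access levels, ordered from least to most restricted.
LEVELS = ["public", "onsite", "reading_room", "dark"]

# Illustrative policy: a default for the whole collection, plus overrides
# keyed by path prefix (series, box, folder, or a single image).
DEFAULT_LEVEL = "onsite"
OVERRIDES = {
    "papers/series1/box2": "reading_room",            # restrict a whole box
    "papers/series1/box2/folder3/0007": "public",     # open up one image
    "papers/series2": "dark",                         # embargoed series
}


def level_for(path, default=DEFAULT_LEVEL, overrides=OVERRIDES):
    """Return the access level for a path; the most specific override wins."""
    parts = path.strip("/").split("/")
    # Check the full path first, then each shorter prefix in turn.
    for n in range(len(parts), 0, -1):
        prefix = "/".join(parts[:n])
        if prefix in overrides:
            return overrides[prefix]
    return default


def can_view(item_level, viewer_level):
    """A viewer sees an item if their context is at least as privileged."""
    if item_level == "dark":
        return False  # dark archive material is shown to no one
    return LEVELS.index(viewer_level) >= LEVELS.index(item_level)
```

Because the overrides are just path prefixes, an archive could process one collection down to the image and leave another at the series level without changing the lookup.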

The container list for an archive enforces a single linear hierarchy on the contents of the archive. Each sheet of paper can only be in one folder, in one box, in one series.


Linked Open Description

If the archive just commits to minting a URL structure then this process opens an exciting new future for description. That is, if every image has a URL, and the folder and collection are named in the URL (ex. http://institution.org/division/collection/series/box/folder/image) then you (or anyone else for that matter) can create a range of descriptions and relationships of those digitized objects. If something comes in substantial disorder, like the Herbert A. Philbrick Papers, many of which came in the trash cans pictured here, then you just make a directory for the trash can and number the images based on the order you pull them out of the can. When you do go ahead and arrange the scans, you can do so while retaining the order they were pulled out of the trash can as a parallel, persistent metadata element.

The net result is that you are no longer limited by the fact that one atom is stuck in one spot. You just index the content in as many ways as you like. Much like the chaotic storage principles at the heart of the design of Amazon’s warehouses, you use the logic, structure and order of the database to transform the order of physical materials into something akin to the random access nature of a hard drive. The result:

  1. You get the benefit of not being limited by the fact that a thing can only be in one place at a time.
  2. You are also not limited to one linear/narrative/sequential way to find things.
  3. Anyone inside or outside an organization can then set up in house, or third party, services to let stewards/curators add any level of description to any arbitrary set of images. That is, internal and external agents could provide distinct data to organize and structure collection content, which the institution could choose to harvest and display to the extent they were interested. Since you are actually minting URLs you could then start to watch inbound links to your items from things like citations and pull those links in as a kind of descriptive trackback.
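As a rough sketch of what minting URLs along these lines could look like, the snippet below builds a URL from the hierarchy and numbers scans in the order they were pulled from a container. The `mint_url` and `number_as_received` helpers and the slugs in the usage note are hypothetical; `http://institution.org` is just the placeholder host from the example above.

```python
from urllib.parse import quote

BASE = "http://institution.org"  # placeholder host, not a real service


def mint_url(division, collection, series, box, folder, image, base=BASE):
    """Build a URL that names every level of the hierarchy, in order."""
    parts = [division, collection, series, box, folder, image]
    # Percent-encode each segment so spaces or slashes in a label
    # cannot change the shape of the path.
    return base + "/" + "/".join(quote(str(p), safe="") for p in parts)


def number_as_received(container, filenames):
    """Record the order scans were pulled out of a container.

    The sequence number is kept as its own field so it survives any
    later rearrangement as a parallel piece of metadata.
    """
    return [
        {"container": container, "received_seq": i, "file": name}
        for i, name in enumerate(filenames, start=1)
    ]
```

For instance, `mint_url("manuscripts", "philbrick", "series-1", "box-2", "folder-3", "0007")` would give `http://institution.org/manuscripts/philbrick/series-1/box-2/folder-3/0007`, with the slugs here invented purely for the example.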
If everything is digitized and each image is given an ID then any number of different modes of arrangement could be minted and maintained referencing the images. Making it function much more like this distributed network. The Network by @nancywhite, CC-BY
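One way to picture several arrangements living side by side is a small index in which each named arrangement is just an ordered list of image IDs. This is only a sketch; the `ArrangementIndex` class and the IDs in the usage note are made up for illustration.

```python
from collections import defaultdict


class ArrangementIndex:
    """Keep any number of named orderings over the same set of image IDs."""

    def __init__(self):
        # arrangement name -> ordered list of image IDs
        self._orders = defaultdict(list)

    def add(self, arrangement, image_ids):
        """Append image IDs to a named arrangement, preserving their order."""
        self._orders[arrangement].extend(image_ids)

    def order(self, arrangement):
        """Return a copy of the ordered IDs for one arrangement."""
        return list(self._orders.get(arrangement, []))

    def arrangements_containing(self, image_id):
        """List every arrangement that references a given image."""
        return sorted(
            name for name, ids in self._orders.items() if image_id in ids
        )
```

With this, `index.add("as-received", ["img-3", "img-1", "img-2"])` and `index.add("chronological", ["img-1", "img-2", "img-3"])` can coexist, and the same image appears in both without being copied or moved.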


Paralyzing or Paralleling Workflows for Archives

I think this could also help to break up much of the serial nature of workflows for cultural heritage orgs. That is, if you digitize everything and give the images persistent URLs that mean things, then you could have any number of processes like arrangement, description, OCR, and even processes for automated description like topic modeling run against your materials in a much more parallel fashion. If we started giving persistent URLs to these images at the beginning of our workflows instead of at the end, we could reap the benefit of running any number of jobs and processes against them simultaneously. Furthermore, these could happen on a rolling basis; that is, you wouldn’t need to wait for any one process to finish before moving on to another. I wrote a bit about this idea in Paralyzing or Paralleling Workflows for THATcamp leadership and a lot of these ideas came up and were discussed at CurateCamp Processing: Processing Data/Processing Collections.
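A toy sketch of the parallel idea, using Python’s standard `concurrent.futures` module: once every image has a persistent URL, independent jobs can all be submitted against the same URLs at once. The `fake_ocr` and `fake_describe` functions are stand-ins invented for this example, not real OCR or description tools.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_ocr(url):
    """Stand-in for an OCR job; a real one would fetch and read the image."""
    return f"text of {url}"


def fake_describe(url):
    """Stand-in for a description job; reads the folder out of the URL."""
    return {"folder": url.rsplit("/", 2)[-2]}


def run_jobs(urls, jobs, max_workers=4):
    """Run every job against every URL without waiting on any other job."""
    results = {url: {} for url in urls}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all (url, job) pairs up front; none blocks another.
        futures = {
            pool.submit(fn, url): (url, name)
            for url in urls
            for name, fn in jobs.items()
        }
        for future, (url, name) in futures.items():
            results[url][name] = future.result()
    return results
```

The point of the design is that a new job type can be added to the `jobs` dictionary later and run against the same URLs, without redoing or waiting on the earlier ones.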

All Kinds of Cans of Worms Opened

All Text Considered: A Perspective on Mass Digitizing and Archival Processing opens all kinds of different cans of worms. For some kinds of materials, the prospect of digitization and OCR could make material accessible in shorter order. With that said, it throws open the doors to figuring out what exactly intellectual control means in those circumstances, what kind of further processing and arrangement one would want to do, and how to go about integrating automated techniques for summarizing and describing content that an archivist might use to complement and extend their efforts to make an archive’s structure legible to their users.

I’d love to hear your reactions to some of my provocations here and any other thoughts and reflections the essay prompts in discussion in the comments.

Thanks to Jefferson Bailey, Thomas Padilla, and Ed Summers for comments on a draft of this post. They each had some great ideas and input. I hope they’ll bring some of their more extended comments into the comments here.

Posted in Uncategorized | 2 Comments

6 Digital Historiography and Strategy Grad Seminars I’d Love to Teach

As I’ve been working on finishing my dissertation over the last two years I haven’t had the chance to teach graduate seminars and I really miss it. I’ve twice taught American University’s History in the Digital Age course for their History and Public History program and I’d love to do that sort of thing again. Partially inspired by other very cool courses I see folks sharing syllabi from, and as a fun thought experiment, here are ideas for six grad seminars I’d love to develop and teach.


Visualizations of the Enron Email Archive Dataset

Understanding and Interpreting Born Digital Primary Sources: Web archives, software collections, video games, digital photographs, email archives, historical laptops, floppy disks; the world (and institutions of cultural memory) are now flush with born digital primary sources. Working directly with digital artifacts students would explore and develop practices and processes for making sense of born digital materials.

Public Digital History: Scholarly Communication, Explication and Participation on the Web: Historians and public historians write books and articles and develop exhibitions to communicate to audiences about the past. The web brings with it a range of modes for communication and dialog and significant opportunities for historians to engage with and invite participation from the people formerly known as the audience.


A photo of the Einstein Memorial shared on Flickr

Sites of Memory: Museums, Monuments and Memory in the Digital Age: What do you make of the TripAdvisor page for the Albert Einstein Memorial? Or of all the selfies people take in museums? What does the potential for augmented reality mean for the set up and presentation of historic homes? The course explores what changes as public sites of memory become part of networked publics.

Historicizing the Digital in Digital Preservation: It’s easy to fall into the trap of thinking that digital objects are a stable and straightforward thing. In practice, electronic records, software, and digital objects have meant different things at different points in the history of computing. This would basically be a take on Allison, Brian and Jefferson’s course.

Studying the Vernacular Web: Making Sense of Records of Everyday Life from the Web: Folklorists, anthropologists, sociologists and other adherents to ethnographic research methods have developed approaches for netnography and virtual ethnography to study the ways that people are creating and developing cultures on the web. The course would focus in particular on the methodological questions inherent to studying the records of computer mediated communication.

Digital Strategy for Cultural Heritage Organizations: Digital is increasingly becoming a key part of nearly every function of cultural heritage organizations (Libraries, Archives, Museums etc.). We are increasingly acquiring, preserving and exhibiting born-digital and digitized materials, using social media for outreach and public relations, supporting researchers and fielding reference questions through digital channels, and supporting all of that work with a substantive IT infrastructure. Looking across each of these areas, this course would focus on exploring ideas for how organizations should be structured, the role software development should play, embedding “digital into the design, decision making, strategy and all the operations” of cultural heritage orgs, and the role that the web should play as a platform and organizing principle for orgs.

So, if anyone from a D.C. metro area institution of higher learning wants someone to teach an awesome special topics course in the evenings after work drop me a line. Oh and please feel free to run with any of these as ideas for your own courses. There is no higher flattery than having

 

 


Historic iPhones: Personal Digital Media Devices in the Collection

What should a library, archive or museum do with an historic iPhone? The National Museum of American History recently acquired journalist Andy Carvin’s iPhone. The announcement about the acquisition piqued my curiosity and prompted a set of questions. I imagine this is something we will be seeing a lot of. The iPhones and BlackBerries of politicians, journalists, digital artists & activists are increasingly the tools of their trades.

So, what should cultural heritage organizations do when presented with rather locked-down personal media devices like this? What follows is a few of my initial strands of thought about it, and a set of related questions I’d be interested in hearing from others about. What is it about these physical and digital objects that is significant and needs to be attended to?

My first thought regarding the acquisition of Andy Carvin’s iPhone: are they going to preserve the contents of the device, or is the idea just to hold on to the physical artifact? That’s more or less what I asked the museum. (Erin Blasco from NMAH and I chatted a bit about this over Twitter.) As I suspected, the idea is basically just to hold on to the physical artifact.

So. What exactly is it that they have? Yes, it is his phone.  Those are scratches on it that he made, and it has his stickers on it. You can put that physical artifact on the shelf and pull it out to examine it. But if you were to ask me what my iPhone is I would mean the stuff inside it. The stuff on it. That is what my phone is.

What is your iPhone?

Is my phone the cracked one in the picture or the one I took the picture with?

I have bad luck with iPhones. I’ve twice shattered the screen of my phone. If you’ve ever swapped out one phone for another you’ve likely had the same slightly surreal experience I’ve had. You back up your phone in iTunes. You plug in the new phone and restore it from the backup. You pop out the SIM card from the old phone and stick it into the new one. Then you power up your new phone.

At that moment, you sorta have two identical digital phones. All your apps are there, all your settings come over, the wallpaper. Last time I changed out my phone I took a picture of the old cracked phone with the new one. I’d moved the ghost in the machine over from one shell to another. I guess more accurately, I’d made a full identical copy. Part of the whole idea of the iPhone as an artifact is that the physical device is supposed to disappear in the user experience. It’s got almost no buttons, and the entire UI emerges through software. You’re supposed to feel that it’s the interface, the pictures under the glass, that is the real device.

So what does that have to do with Andy Carvin’s iPhone? Well, I’d imagine he still has his phone, and that what NMAH received is sorta like the cast-off phone I had there in the box. He migrated his device forward and what remains is more of a time capsule: a historical moment of Andy Carvin’s iPhone. Just like I can go power up that cracked phone in the box on my shelf and see what my phone was like from 7:38 PM – 22 Aug 13, 2013, if you turn on Andy’s phone in the collection (assuming he didn’t delete everything on it before giving it) you would be able to see a moment in time of his phone: exactly what it was like right before he transferred its contents to another device.

The iPhone’s NAND memory

In any event, as far as I’m concerned, a device like an iPhone is first and foremost a digital object. It’s the data on the NAND memory in there brought to life by the software in it that is what the phone is. Which leads to a bit of consideration of the digital object of the iPhone.

The Digital Object of an iPhone

Where someone can make a disk image and emulate Salman Rushdie’s old laptops, the contents of Andy Carvin’s iPhone are more elusive. If you have a power supply, you’ll likely be able to power this thing up and see what’s on it, now and into the future. But getting things off of the device itself would be more of a challenge. You could (for the time being) boot up a computer and read it like a drive to, say, get copies of all the photos and videos off of it. Or, if you had the skill set, you could go ahead and get into mobile device forensics and actually capture a full disk image of the device.
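
To make the simpler of those two routes a bit more concrete, here is a minimal sketch of what copying the photos and videos off a device that shows up as a drive might look like. It assumes the device is already mounted as an ordinary folder. The function names, the list of file extensions, and the idea of recording a SHA-256 checksum for each copy are my own illustration and not anything NMAH or anyone else actually does. It is not mobile forensics and it does not produce a disk image.

```python
import hashlib
import shutil
from pathlib import Path

# Hypothetical list of media extensions; adjust for whatever the device exposes.
MEDIA_SUFFIXES = {".jpg", ".jpeg", ".png", ".mov", ".mp4"}


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_media(source, destination):
    """Copy media files from source to destination, keeping relative paths.

    Returns a manifest (a dict) mapping each relative path to the checksum
    of the copied file, so the copies can be re-verified later.
    """
    manifest = {}
    for item in sorted(source.rglob("*")):
        if item.is_file() and item.suffix.lower() in MEDIA_SUFFIXES:
            relative = item.relative_to(source)
            target = destination / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(item, target)  # copy2 also tries to keep timestamps
            checksum = sha256_of(target)
            # Confirm the copy matches the file still on the device.
            if checksum != sha256_of(item):
                raise IOError(f"Copy of {relative} does not match the original")
            manifest[relative.as_posix()] = checksum
    return manifest
```

You would call it with something like `copy_media(Path("/Volumes/iPhone"), Path("carvin_iphone_copy"))`, where both paths are made up for the example. The manifest is only there so that someone could later check that the copies have not changed.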

The Tweets he made from the phone aren’t in there

Much of the content viewed on iPhones, and similar devices, is pulled in over the network. So if you aren’t connected, or when those services eventually turn off, you won’t have access to that content.

One of the points of this artifact, what matters about it, is what Carvin did on Twitter: his use of Twitter as a medium for reporting. While he used this particular phone to send out those tweets, the device itself does not have copies of those tweets in it. If you booted it up and opened the Twitter application on it there is a good chance that you could read his tweets, and the tweets of those folks he follows. However, you would be reading those via the device logging into Twitter and downloading that content. So if you were interested in collecting his tweets, you would actually want to go out and ask him to download a copy of his Twitter archive and send it over to you.

The other Smithsonian iPhone

As a point of comparison, there is at least one other iPhone in the collections of the Smithsonian Institution. In a piece on the acquisition of an iPhone app, Seb Chan from the Cooper Hewitt Design Museum wrote about the iPhone they have in the collection and the inherent limitations in thinking about how to make use of that device.

The iPhone in our collection is neither powered on nor has it been kept up to date with newer software releases. Eventually the hardware itself might be considered so delicate that to power it on at all would damage it beyond repair—a curse common to many electronic objects in science and technology collections. How then do we preserve the richness and novelty of the software interfaces that were developed and contributed equally if not more than the industrial design to that device’s success?

Some open discussion questions:

This is all me just thinking out loud here. Or I guess, thinking out in bits. I’d love to hear thoughts and comments from folks on what this acquisition prompts. In particular, on any of the following four questions.

  • What should archives and museums presented with iPhones be doing with them?
  • How would you even go about attending to the digital object of the iPhone? I’d be curious to hear ideas for how one might go about ingesting, preserving, and eventually providing access to the digital contents of the physical device, and I’d love to hear other folks think that through.
  • Do you know of any other examples of acquisitions of personal media devices like this? If so, I’d love to hear about the who, what, where, why of that.
  • What analogies can we draw between different kinds of artifacts museums collect and Carvin’s phone? If the guts of it die and you can’t power it up, is it like a folder that once contained a set of notes? If you can power it up, is it like a fly trapped in amber that we can study as it was preserved in a particular moment in time? Since it doesn’t have a copy of the tweets in it, is it like the red phone from the White House, which would have been used to make particular calls but has practically no trace of the content of those calls in it? What other connections or parallels might you draw?

Google Poems on History

I thought it would be fun to see what Google poems come out of history, libraries and archives. So here you go. Curious to hear if these mean anything to you.

History is…

But history isn’t…

The past is…

But the past is not…

Historians are…

Memory is…

Archives are…

 


Notes toward a Bizarro World AHA Dissertation Open Access Statement

Bizarro World AHA would be totally into Open Access

The American Historical Association published a Statement on Policies Regarding the Embargoing of Completed History PhD Dissertations. I found myself wishing that there was some kind of bizarro world AHA. I imagine this bizarro world AHA might have made remarks based on these bullet points. These are just a rough draft. I encourage others to refine and further develop them.

  1. Assert that the scholarly society’s goals are for the proliferation of knowledge not the proliferation of a particular kind of media (like monographs) or a particular business model (like selling academic monographs, primarily to university libraries).
  2. Thank doctoral students who have made their dissertations accessible to anyone for supporting the value of sharing their research.
  3. Note that dissertations are fundamentally different than the books a university press might edit, develop and revise based on them. Beyond that, assert that open access to dissertations in no way competes with books that are developed from dissertations.
  4. Explain that the scholarly society would speak out against publishers who decided to blackball scholars who had made their dissertations publicly accessible through their universities’ repositories.
  5. Suggest that it is fundamentally problematic that the tenure and promotion of historians is based directly on the commercial viability of academic books. Where scholars in other disciplines often control the primary means of tenure (journal articles), in fields like history that rely on book publication those decisions are (in large part) made by academic presses.
  6. Call for members of the association to explore and encourage the development of new models for the review and evaluation of a wide range of historical work, particularly those that make scholarship as widely accessible as possible.
  7. Note that it is a fundamental problem that career development for historians in the academy is focused on the production of books that are read by few people, and encourage the community of historians to refocus their energy on how they can produce historical work that people will read and that can have an impact on society.

 

 
