It’s Items All the Way Down

Recursion, ▓▒░ TORLEY ░▒▓, CC-BY-SA

Often folks in the world of Libraries, Archives and Museums are asked to give an account of how many items they have. At that point the person asked must come up with a way of taking account. It’s good to take account. Decisions about what constitutes an item and what constitutes a collection or a series of items are often presented as if they were a simple matter of fact, when they are actually based on a set of decisions made in a given context.

The point being, items don’t exist as much as they are made by applying a set of judgement calls to things that exist. Item-ness isn’t innate as much as it is the result of a process of making the world legible.

Items are made when we make judgement calls about the relative importance of these features:

  1. Physical distinct-ness (The item represents a physical whole, a discrete physical or digital object)
  2. Authorship/Creatorship (The item represents an intellectual whole, it comes from a particular author or creator or process)
  3. (Are there other things you would add?)

Items and their item parts: 

  • A book is an item, it is also a collection of pages which are themselves items, it might contain chapters, potentially written by different authors, each of those is an item, each of the figures printed in the book is an item.
  • An archive is an item, each folder in an archive is an item, the individual letters in that folder are items, the five letters that someone stapled together that are in that folder are an item too.
  • A videotape is an item, each of the individual recordings on that tape is an item.
  • A web archive is an item, each URL in the web archive is an item, each file in the web archive is an item, each directory is a kind of item.
  • A newspaper is an item, the articles in the newspaper are items, a year’s worth of newspapers bound together is an item.
  • 24 hours of radio or television broadcast is an item, all of the individual shows are also items, each commercial is an item, each distinct block of air time with its individual commercials is an item.
  • A computer is an item, each of its hard drives is an item, the directories on the hard drive are items, the files in those directories are items, the sectors of the disk are items.
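
One way to make the nesting above concrete is a small sketch in Python. The `Item` class, the `count_items` function, and the example book are all hypothetical, invented here for illustration. The only point is that the answer to "how many items do you have?" depends on how far down you decide to count.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """A thing we have decided to treat as an item; its parts are items too."""
    name: str
    parts: List["Item"] = field(default_factory=list)


def count_items(item: Item, max_depth: Optional[int] = None) -> int:
    """Count an item plus its parts, descending at most max_depth levels.

    max_depth=None means keep descending all the way down.
    """
    if max_depth is not None and max_depth <= 0:
        return 1  # stop here: treat this item as a single whole
    next_depth = None if max_depth is None else max_depth - 1
    return 1 + sum(count_items(part, next_depth) for part in item.parts)


# Hypothetical example: a book with two chapters and three figures.
book = Item("book", [
    Item("chapter 1", [Item("figure 1"), Item("figure 2")]),
    Item("chapter 2", [Item("figure 3")]),
])

print(count_items(book, max_depth=0))  # the book alone
print(count_items(book, max_depth=1))  # the book and its chapters
print(count_items(book))               # the book, chapters, and figures
```

The same object yields three different counts. None of them is wrong; each reflects a different judgement call about where item-ness stops.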


Posted in Uncategorized | 1 Comment

Signifying & Significance: Figuring out what matters and saving the digital things that testify to that mattering

Last week I was excited to participate as a panelist in a small conference at the Bard Graduate Center called Digital/Pedagogy/Material/Archives. The goal of the event was to bring together scholars working at the intersection of these four terms to think about how best to grapple with the challenge of archiving new forms of digital scholarship coming out of the classroom.

I’ve posted my slides on slideshare (and embedded them below) but in the course of our discussion I thought it would be helpful to share some links and brief points to round out my perspective both for those who participated in the conference and anyone who comes across my slides on the web.

Part of the goal of the meeting was to provide provocations on these terms. So what follows, and what is in my slides, is intended to be a bit provocative.

ARIS & Menokin Student Project: 

I decided it would be best to share a particular case study of a student project as an example to think through what exactly someone might want to save in a digital object. I shared a place-based game that Laura Heiman & Caitlin Miller (students in my History and the Digital Age course) created. The course is a research methods course for doctoral students at American University and an elective for students in their Public History Master’s program. I picked these two blog posts to share for context on the project.

For background on my approach to teaching this course, and for why I have these two blog posts to point to, check out this post I wrote a while back.

In my talk I tried to push up against some of the default assumptions that we bring to talking about preserving digital objects and artifacts. I tried to articulate a perspective on preservation that is grounded in first identifying what material objects (digital or analog) would be able to testify, or be the place where evidence exists, of the information you think is interesting. For background and context on where some of these remarks come from, here are some pieces I’ve written and drawn from.

  • Significance is in the eye of the stakeholder: What’s important about this piece is the notion that objects don’t have significant properties. Instead, there are different properties of objects that are significant to different potential users/audiences/stakeholders.

Objects have an infinite number of properties, those properties include traces of the past that people can interpret as evidence for claims. With that said, to take a preservation action is to decide to act to ensure access to properties that one finds particularly important or significant.

  • The is of the digital object and the is of the Artifact : This is mostly a riff on the materiality of digital objects. What is important in this context is that there are a lot of different material artifacts that have traces of the things that we care about. I mention Ian Bogost’s example of the 11 different things that the video game E.T. the Extraterrestrial is to underscore that even defining what that game is requires an extensive set of decisions about what someone might want to do with it. Now if you want to preserve that video game you need to think about which of those 11 objects you want to have evidence about.

So, if you want to preserve E.T. the Extraterrestrial, or anything for that matter, it’s critical to realize that drawing boundaries around what the object itself is requires you to make decisions about the significance of particular properties for particular potential uses.

  • Glitching Files for Understanding: With all this said, digital objects are material objects and as such they have very real properties. Once you have identified an artifact that you care about you need to attend to the facts or features of its existence. In this post, I try to break apart many of the assumptions of screen essentialism, the idea that what digital objects look like on the screen is their essence. In contrast, digital objects are in fact bits of encoded information.

If we understand and respect the formal and forensic materiality of these objects in the context of the properties of them that we want to preserve then we are well on our way to making this work.

In this case, if you want to know about the Albert Einstein Memorial, the physical object of the statue is not where you find the evidence you need. This is not a particularly interesting point. What is interesting is that the archival record of the memorial’s creation in the National Academy of Sciences archives is a far less interesting source of information about what the memorial means to people than the very ephemeral information about the memorial in reviews of it on Yelp and TripAdvisor and the pictures people share of it on Flickr. This is to say that there is a wake of artifacts that exist in a network of meaning around this particular memorial, and that many of the most interesting artifacts for getting at what the memorial means to people are not the thing itself (the memorial) or the records about the thing itself (the documents in the archives) but are instead things people are saying about it (posts on Yelp & TripAdvisor) or doing with it (using it as a prop to take photographs). So what you think matters about the object will push you to try to save different kinds of objects that can provide potential evidence on that subject.

  • Archives in Context and as Context: During our discussion we slid around a good bit in what we meant by archiving, archives, and preservation. So it’s worth linking out to Kate’s great piece on this subject as it provides good grist for pinning down what we mean at different moments about archives.

The main point here is that one needs to be careful in clarifying what one means with the word archive. Kate focuses on a few points, but one of the trickiest that comes up is the use of “archiving” as a verb or saying something “has been archived.” Archives are places, and similarly preservation is something that institutions do, not something that is accomplished. Nothing is preserved, there are only things that are being preserved. Nothing is archived, there are only things that are in archives.

If there is one thing we can count on it’s entropy. All material objects in the world are wearing down and degrading. Everything in the world eventually succumbs to its own inherent vices. At the end of the day the question is what traces of the world recorded on artifacts we want to commit ourselves to ensuring long-term access to.

Posted in Uncategorized | Leave a comment

Designing Online Communities: Read My Accepted Dissertation Proposal

Wisdom of the Ancients: the web-comic-epigraph for my dissertation proposal, from XKCD

As of last Monday, I have now successfully defended my dissertation proposal. In the context of my doctoral program, that means there is just one more hurdle to climb over to finish. I’m generally rather excited about the project, and would be thrilled to have more input and feedback on it (Designing Online Communities Proposal PDF). I would be happy for any and all comments on it in the comments of this post.

Designing Online Communities: How Designers, Developers, Community Managers, and Software Structure Discourse and Knowledge Production on the Web

Abstract: Discussion on the web is mediated through layers of software and protocols. As scholars increasingly turn to study communication, learning and knowledge production on the web, it is essential to look below the surface of interaction and consider how site administrators, programmers and designers create interfaces and enable functionality. The managers, administrators and designers of online communities can turn to more than 20 years of technical books for guidance on how to design and structure online communities toward particular objectives. Through analysis of this “how-to” literature, this dissertation intends to offer a point of entry into the discourse of design and configuration that plays an integral role in structuring how learning and knowledge are produced online. The project engages with and interprets “how-to” literature to help study software in a way that respects the tension that exists between the structural affordances of software and the dynamic and social nature of software as a component in social interaction.

What’s Next? 

At some point in the next year I will likely defend a completed dissertation. Places do dissertations differently; in my program the idea is that what I just defended is actually the first three chapters of a five-chapter dissertation. So, at this point I need to follow through on what I said I would do in my methods section (to create chapter 4, results) and then write up how it connects with the conceptual context section (to create chapter 5, conclusions). So I should be able to grind this out in relatively short order.

At this point, I think this project should be interesting enough to warrant a book proposal. So I’ll likely start exploring putting together a book proposal for it in the next year as well. With that in mind, any suggestions for who might be interested in receiving a proposal on this topic are welcome.

Posted in graduate school, research | 3 Comments

Small Pieces Loosely Kludged: Peer Review and Publication in Math Scholarly Communication

I’m always interested to hear about how different scholarly communities are changing their communications practices. Things like PLOS One, and projects like PressForward are putting forward interesting and new models for when and where review happens and how we establish credibility and mark for quality. At the recent ScienceOnline conference I had the pleasure of chatting a bit with David Zureick-Brown, a mathematician and one of the founders of MathOverflow. Given how forward thinking much of the math community has been in this scholarly communication space I was thrilled to have a chance to pick his brain about similarities and differences between fields.

High Rates of Rejection in Math Journals: It works different there

I was initially taken aback by something David said. It was something like “In my field, if you aren’t getting at least a 50% rejection rate on papers you submit to journals you aren’t aiming high enough.” The idea being that you should try to get your work into more prestigious journals, and many of these journals have two-year backlogs. In one situation, a paper that had largely positive reviews was rejected because it wasn’t important/exciting enough. This is exactly the thing that projects like PLOS One are set up to get around: they try to stop evaluating papers for importance and instead do a minimal evaluation of whether they pass a minimum bar.

Publication happens before Publication

Initially I thought this sounded terrible! You submit your papers, wait for rejections, and then shift down a bit. Wouldn’t this hold up getting your work out there? But then I remembered that math is different. At this point there is an expectation that you put all your work up on arXiv as soon as it is coherent enough to be called a paper. So this review process wasn’t holding up the publication process. As soon as work is done it’s published. People start reading it on arXiv. When I realized this I suggested “Oh, so publication in a journal is actually really just like a mark of quality, it’s like a merit badge.” Now, it’s a really important merit badge in the field, as the quality of the journals you are published in is a key factor for tenure and promotion. So getting a piece published in a particularly prestigious journal is effectively a seal of quality/approval that a given work matters and is significant for the field.

Small Pieces Loosely Kludged

This kludged-together system seems like a great outcome. I can’t imagine anyone set out to make this system work this way. Anything can get published on arXiv, at which point anyone can see the work, cite the work, and reference it. The journals are now really just serving as amplifiers. The peer review of this work is actually post-publication peer review. In this system it sort of doesn’t matter if journals want to become open access. If they let you put up pre-prints you’re good to go. The content of the journals is already published and open access. It only costs folks money to see the papers if they want to see the fancy PDFs.

It’s largely about when you call it a publication

So post-publication peer review and pre-publication review are actually much more dependent on what we call the publication. Humanities and Social Science folks can just start to put all their stuff up in places like, or up on SSRN before submitting it. In many social sciences at this point this is a standard practice. While I’m a big fan of institutional repositories, I find the situations where the field-specific platforms have emerged a bit more exciting. In these cases, the expectations and behaviors of scholars have shifted. It’s the norm to expect that you can see your colleagues’ work as quickly as it’s come together online in these spaces.

So why doesn’t this happen in History and the Humanities?

The fact that arXiv, SSRN and sites like RePEc and a few other disciplinary networks emerged for sharing scholarship in draft form and that nothing like them has taken off in the humanities is an indictment of the humanities. How come Mathematicians, Astronomers, Economists and a range of other fields could just set up places to share their work and humanists haven’t? As you can see from the Math situation, if a scholarly community just shifts to sharing pre-prints and everybody does it then it basically doesn’t matter what publishers want to do in terms of open access. This is to say that scholars have no one but themselves and their peers to point to if they don’t like how scholarly communication works. As the math case shows, we can patch our scholarly communication system one kludge at a time and end up with a system that embraces broad open access and rapid dissemination and retains merit badges for quality.


Posted in Uncategorized | 1 Comment

Front Lines: Early-Career Scholars Doing Digital History… Virtual AHA Panel Participation

I may not be at AHA 2013, but that won’t stop me from participating on a panel. Below is a series of videos I created for an AHA 2013 panel, “Front Lines: Early-Career Scholars Doing Digital History.” Each video responds to a prompt for discussion. Both Miriam Posner and I are virtually participating, so I will be interested to hear how it ends up working out in meatspace. For those of you who stay up late, you can see me participate in the panel before it actually happens.

For starters it is probably a good idea for each of us to describe what it is we actually do and why we think what we do counts as digital history.

What is the relationship between your digital work and your larger body of historical scholarship?

How have digital projects changed your approach to degree requirements, publishing, promotion, (and tenure if relevant)?

Looking back at your education and training (both formal and informal) what are some of the most important experiences, the things that set you up with the skills you need to land the job you have?

What kinds of resources can institutions offer to early-career digital historians (especially institutions that are not home to DH centers)? Where can digital historians find important communities/resources outside of their institutions?

Here is the abstract for the session:

Front Lines: Early-Career Scholars Doing Digital History

Digital history’s growth in popularity has been accompanied by anxiety about how, and whether, these new methods and their practitioners will fit into traditional history departments. At the 2012 meeting of the American Historical Association, discussions of digital history often turned to questions about graduate education, the job market, publication, and promotion. This roundtable aims to approach these questions head-on, relaying experiences and recommendations from early-career scholars navigating these transitions.

Digital historians who elect to enter the professoriate often find themselves faced with a number of questions related to credentialing, tenure, and promotion. Many digital projects, for example, require publication venues other than the bound monograph. What sorts of avenues exist for digital publications? Will tenure committees be prepared to accept and evaluate these nontraditional projects? How many universities can be expected to offer the infrastructure and resources digital historians need?

The AHA’s leaders have suggested that for new Ph.D.s, one solution to the jobs crisis may lie in seeking careers outside of the professoriate — an option that digital historians have been particularly interested in pursuing. How can graduate students gain the experience to prepare themselves for these positions? If new Ph.D.s turn to these alternative academic careers, what can they expect? Can a historian in a nontraditional career expect to pursue a research agenda? What are these alternative jobs, and how well are new Ph.D.s adapting to them?

In this roundtable, a group of digital historians, in jobs both on and off the tenure track, will take up these questions, drawing on their own experience to suggest how we can prepare young digital scholars to enter various job markets, and how we can prepare employers to receive them.

Posted in Uncategorized | 2 Comments

2012 Year in Review: Digital History, Digital Cultural Heritage, and the Born Digital History of Science

Looking back on this year makes me exhausted. It looks like I managed to put up 34 posts on The Library of Congress Digital Preservation Blog as well as 11 posts on Play the Past and 24 posts here on my own blog. Seven different things I wrote ended up churning their ways through the process of becoming journal articles or book chapters, and by my count I was involved in 12 conferences (4 of which I was involved in planning). All of that led me to make the face below.

Photo of me from the OSI Newsletter

What follows is my attempt to make sense of it all and to give anyone interested a rundown of what I’ve been up to. Looking back over what I have gotten into this year, I think I can (broadly speaking) fit most of what I have worked on into one of two buckets: digital strategy for cultural heritage organizations and work trying to further advance digital history.

Digital Strategy for Cultural Heritage Organizations
Earlier this year I had a chance to interview Michael Edson from the Smithsonian for the LC blog. In working up one of my questions for that interview I think I’ve found one of the central questions that much of my work responds to.

Where do you think the home should be for digital media in a cultural heritage organization? Or, how do you think one should divide up roles and responsibilities when digital is increasingly becoming a key part of nearly every part of cultural heritage organizations? We are increasingly acquiring, preserving and exhibiting born-digital and digitized materials, using social media for outreach and public relations, supporting researchers and fielding reference questions through digital channels, and supporting all of that work with a substantive IT infrastructure. Who should be whose ramp and loading dock?

I was thrilled to have the opportunity to forward my own answer to this question when I was invited to keynote the Connecticut Digital Initiatives Forum. I think some of the features of the digital make it possible to apply a lot of the ideas that have come out of the open source software movement to how we do a lot of other work. I called this Do Less More Often: An Approach to Digital Strategy for Cultural Heritage Organizations. Everybody is trying to do too much at once. Find the low-hanging fruit and pick it. Get the boxes off the floor. Release early and release often. Put things out there and find out how you should be doing things. I think this idea cuts across all parts of digital cultural heritage work. Everything from collecting, processing, arranging, and preserving to making available and exhibiting can be re-framed in this mindset.

As an example, alongside this year’s Digital Preservation Conference I helped to facilitate CURATEcamp Processing, an unconference focused on bringing together notions of archival processing and computational processing. The event itself (minimally planned and programmed and participant driven), to me, exemplifies do less more often. At the same time, some of the great work on applying More Product, Less Process for Born-Digital Collections and Born Digital Minimum Processing and Access is also a great fit, in that these become ways to think about iteratively structuring work. A similar iterative approach is evident in the NDSA levels of digital preservation project, which went from a concept to a release candidate over the course of the year.

Another big area of strategy I did a good bit of thinking and writing about this year was crowdsourcing. You can see a recap of most of my Crowdsourcing Cultural heritage posts here.

Advancing Digital History: Practices, Tools, and Data

This year I wrote a bit about how historical research is changing as a result of digital tools, I worked on building and designing a tool for historians, and I was thrilled to be able to participate in ongoing conversations about how historians thinking about

I was excited that the essay Fred Gibbs and I wrote, Building Better Digital Humanities Tools: Toward broader audiences and user-centered designs, made its way into Digital Humanities Quarterly earlier this year. I was also thrilled to see that a lot of the things we found about how historians were approaching digitized source material were similar to what ITHAKA found in their recent study of historians’ research practices.

Keeping these ideas in mind, I was thrilled to work on and write about the ongoing work on Viewshare. Jefferson Bailey and I wrote From Records to Data with Viewshare: An Argument, An Interface, A Design for the Bulletin of the American Society for Information Science and Technology and Viewshare: Digital Interfaces as Scholarly Activity for Perspectives on History and, in collaboration with Lauren Algee, wrote Viewshare and the Kress Collection: Creating, Sharing, and Rapidly Prototyping Visual Interfaces to Cultural Heritage Collection Data for D-Lib.

I was also happy to be able to respond to the Joint Conference on Digital Libraries panel on the Digging into Data grant project in One Culture: Digital Collections, Computational Humanities and History at Scale, and got a good bit of traffic to my blog for my post on how Discovery and Justification are Different in digital history research. Oh, and Defining Data for Humanists: Text, Artifact, Information or Evidence?, based on a blog post of mine from last year, was published in the Journal of the Digital Humanities. I see all of these threads coming together to change how historians talk and think about sources as data. On that, Fred Gibbs and I finished an essay called The Hermeneutics of Data and Historical Writing.

Born Digital Primary Sources for Historical Research
One of my biggest projects this year was planning, running and working on the report for Science at Risk: Toward a National Strategy for Preserving Online Science. We set out asking “what kinds of online science content will be invaluable for understanding science in our age?” and I think we came to some valuable answers and calls to action.

I was happy to see my essays, Tripadvisor rates Einstein: Using the social web to unpack the public meanings of a cultural heritage site and Teaching intelligent design or sparking interest in science? What players do with and take away from Will Wright’s Spore published this year. Both are attempts to explore the kinds of research we can do when we work from born digital primary sources from the open web. These two pieces focused on materials on the web, but I did a bit of writing about some other sources as well.

In The is of the Digital Object and the is of the Artifact I explored some of the nitty gritty details and waxed a bit philosophical about digital objects and artifacts. I see this kind of perspective as relevant to some of the writing and interviews I did around software and video game preservation, both topics I am excited to be more involved with in the future. On this topic see Yes, The Library of Congress Has Video Games: An Interview with David Gibson and Exhibiting Video Games: An Interview with Smithsonian’s Georgina Goodlander. Oh, and I also gifted my own video games to the Library. Lastly, there is this start to a list of work on software preservation, Preserving.exe: A Short List of Readings on Software Preservation.

Where do we go from here?
So I think I’ve had a productive year. I imagine most of these threads will continue into the new year, but I am also excited about the prospect of getting involved in some other new and exciting projects both at LC and on my own.


Posted in Uncategorized | Leave a comment

Implications for Digital Collections Given Historians’ Research Practices

The new ITHAKA report, Supporting the Changing Research Practices of Historians, is something that everybody working with cultural heritage collections should read. It’s full of good stuff, but in my opinion the key finding is that Google is now (by and large) the first step in historical research. Fred Gibbs and I reported on nearly the same finding in our recent paper on digital tools for historians. The Google search box is the first place historians go when they start their research, and it plays a key role in their discovery process. This is particularly true for idiosyncratic terms, phrases and people’s names, which often turn up results from Google Books. So, the next time someone tells you that they want to make a “gateway,” a “portal,” or a “registry” of some set of historical materials you can probably stop reading. It already exists and it’s Google.

The report makes some suggestions for what libraries and archives should do to help make their materials more accessible. Namely, that they work to integrate them with discovery tools and that they do what they can to make more finding aids accessible online. Both of these are valuable, but I think both goals fail to fully integrate the finding about Google and Google Books. If a library, archive, or museum wants its resources to be found as part of the discovery process, the initial phase of theory development, it needs to be thinking about how to get its materials (or information about its materials) to show up in Google search results.

Are more and bigger online finding aids really an answer?

The report suggests that we cultural heritage organizations should be getting more finding aids up. That’s great, that would be useful. However, given the finding about Google, I think an even bigger potential lesson here is that if you want your collections to be used by researchers (digital or otherwise) the first thing you need to think about is not finding aids but web pages about items, boxes, collections, etc. that will be discoverable in Google. In short, I would rather see a well-structured web page with a well-chosen title and persistent URL before one even begins to make a finding aid. This is not about SEO, it’s about doing very simple things that make for better HTML pages. Importantly, if an org makes a single PDF out of a finding aid for a collection and puts it on the web, that finding aid is almost useless as far as Google is concerned.

What would finding aids look like if they assumed the existence of the web and web search?

To me this raises a rather controversial question. If the goal of the finding aid is to help researchers find things and the way they do that is to search Google (which is really good at looking for particular things in HTML pages) then why is the HTML page a byproduct of the EAD XML finding aid and not the primary thing that the archivist authors? We designed an infrastructure around EAD and found ways to make that into HTML pages, but in the meantime Google came around and historians found out that Google was so much more useful and powerful a way to search that they only consult the finding aids to round out the ideas they have already started developing. So, what would minimal archival processing for access look like if we thought first about creating an HTML web page for every collection or every box?
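
As a rough illustration of how little it might take, here is a minimal sketch in Python that writes out one small HTML page for a collection, with a descriptive title, a persistent URL, and a list of boxes. Everything specific here is an assumption made up for the example: the `collection_page` function, the collection name, and the example.org URL. It is not a replacement for EAD or a description of any existing system, just one way of putting the web page first.

```python
from html import escape
from typing import List


def collection_page(title: str, persistent_url: str, summary: str, boxes: List[str]) -> str:
    """Render one small, search-friendly HTML page for a collection.

    All text is escaped so that characters like & and < stay valid HTML.
    """
    box_items = "\n".join(f"      <li>{escape(box)}</li>" for box in boxes)
    return f"""<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>{escape(title)}</title>
    <link rel="canonical" href="{escape(persistent_url)}">
  </head>
  <body>
    <h1>{escape(title)}</h1>
    <p>{escape(summary)}</p>
    <ul>
{box_items}
    </ul>
  </body>
</html>
"""


# Hypothetical collection, invented for this example.
page = collection_page(
    title="Jane Q. Example Papers, 1901-1950",
    persistent_url="https://example.org/collections/example-papers",
    summary="Correspondence & diaries of a hypothetical person.",
    boxes=[
        "Box 1: Correspondence, 1901-1920",
        "Box 2: Diaries, 1921-1950",
    ],
)
print(page)
```

The design choice is simply that the page a search engine will index is the thing being authored, with one page per collection (or per box) rather than one large PDF.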

Posted in Uncategorized | 14 Comments

Seeing With Cinemagram

I’ve been dabbling a bit with Cinemagram this week. It’s a free app that lets you create cinemagraphs. Their tagline is “Create a stunning hybrid between photo and video” and it does a nice job of letting you create something that does just that. It’s done a nice job of getting me to see my walk to and from work a little bit differently.

You record short 2-second videos and then draw a mask on the photo to identify the part of the image you want to be animated. The rest of the image stays still. The end product is an animated GIF. For example, in the image above I set it to keep counting down at the end of the walk signal. You’re always just about to have the light switch to red.
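
The general idea of masking can be sketched in a few lines of Python. To be clear, this is a guess at the technique in its simplest form, not how the app itself works. The `composite` function is hypothetical, and the frames here are tiny grids of numbers standing in for pixels: wherever the mask is on, each output frame takes its pixel from the moving video; everywhere else it reuses the first frame.

```python
from typing import List

Frame = List[List[int]]   # a grid of pixel values (stand-ins for colors)
Mask = List[List[bool]]   # True where the image should keep moving


def composite(frames: List[Frame], mask: Mask) -> List[Frame]:
    """Freeze everything outside the mask to the first frame."""
    still = frames[0]
    rows, cols = len(still), len(still[0])
    return [
        [
            [frame[r][c] if mask[r][c] else still[r][c] for c in range(cols)]
            for r in range(rows)
        ]
        for frame in frames
    ]


# Hypothetical 2x2 "video" of two frames; only the top-left pixel is masked in.
frames = [
    [[0, 0], [0, 0]],
    [[1, 1], [1, 1]],
]
mask = [
    [True, False],
    [False, False],
]

for out in composite(frames, mask):
    print(out)
```

Looping the resulting frames is what gives you a still photo with one small moving part.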

It’s an interesting process. It gets you to see spaces in different ways. It’s fun to look for things that can run as repetitive motions in scenes where a lot of other action is held still. For example, getting things like the car in the image to blur from motion while keeping the lightly flapping flag going.

It’s tricky to get them to pan out exactly right. But it is a lot of fun to try and find things that you can play back and forth with.

By focusing in on very little movements, like rustling leaves or the lights on a police car, you end up with things that have this strange quality of being something between a photograph and a video. Aside from being neat, it’s rather easy.

I think it’s always fun to get a new toy like this that prompts you to look around at the world a little differently, to try and see with a different eye.


Posted in Uncategorized | 1 Comment

Discovery and Justification are Different: Notes on Science-ing the Humanities

Computer Scientist: “You can’t do that with Topic Modeling.”

Humanist: “No, I can because I’m not a scientist. We have this thing called Hermeneutics.”

Computer Scientist: “…”

Humanist: “No really, we get to do what we want, we read texts against each other, and then there is this hermeneutic circle grounded in intersubjectivity.”

Computer Scientist: “Ok, but you still can’t make a claim using this as evidence.”

Humanist: “I think we are going to have to agree to disagree here, I think we have different ideas about how evidence works.”


While watching the tweets from the Digital Humanities Topic Modeling meeting a few weeks ago I started to feel the above dialog play out. I wasn’t there, and I am not trying to pigeonhole anyone here. I’ve seen this kind of back and forth happen in a range of different situations where humanities types start picking up and using algorithmic, computational, and statistical techniques. What of all this counts for what? What can you say based on the results of a given technique? One way to resolve this is to say that humanists and scientists should have different rules for what counts as evidence. I am increasingly feeling the need to reject this different rules approach.

I don’t think the issue here is different ways of knowing, incompatible paradigms, or anything big and lofty like that. I think the issue at the heart of this back and forth dialog is about two different contexts. This is about what you can do in the generative context of discovery vs. what you can do in the context of justifying a set of claims.

Anything goes in the generative world of discovery
If something helps you see something differently then it’s useful. If you stuff a bunch of text into Wordle and see a word really big that catches you by surprise you can go back to the texts with this different way of thinking and see why that would be the case. If you shove a bunch of text through MALLET and see some strange clumps clumping that make you think differently about the sources and go back to work with them, great. You have used the tool to spark a different way of seeing and thinking.

If you aren’t using the results of a digital tool as evidence then anything goes. More specifically, if you aren’t trying to attribute particular inferential value to a particular process that process is simply producing another artifact which you can then go about considering, exploring, probing and analyzing.  I take this to be one of the key values of the idea of “deformance.” The results of a particular computational or statistical tool don’t need to be treated as facts, but instead can be used as part of an ongoing exploration. With this said, the moment you turn from exploration and theorizing to justifying an interpretation the whole game changes.

Justification is About Argument and Evidence
If you want to use something as evidence then it is really important that you can back up the quality of that evidence in supporting the specific claims you want to make. In the case of topic modeling, you need to make judgment calls about how many topics to look for, and you make the call about which texts from which sources go into the mix to generate your topics. If you want to talk about these topics as evidence to support particular inferences then you had better be able to justify your reasons for those decisions, or be able to explain what you did with your data to warrant the interpretation you are forwarding. You are also going to need to explain how different decisions for different inputs could have resulted in different results. (I am mostly going off of the discussion in and around Ben Schmidt’s When you have a MALLET, everything looks like a nail.)

The net result here is that if you want to use the results of something like topic modeling as evidence you really need to have a good understanding of exactly what you can and can’t say based on how the tool produced your evidence. Importantly, there are a lot of different roads to go down when you start working with data as evidence, but in any event, you do need to be able to justify your decisions and defend against alternative explanations. Ultimately this is where validity of inferences lives. Validity is always about the quality of the inferences you draw and your ability to defend against alternative explanations.
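
How much the output depends on those decisions can be sketched in a few lines of Python. What follows is a toy collapsed Gibbs sampler for LDA, not MALLET and not the method used in any of the work discussed here. The function name `fit_topics`, the four-document corpus, and every parameter value (`alpha`, `beta`, `iterations`, `seed`) are assumptions made up for illustration.

```python
import random
from collections import Counter


def fit_topics(docs, num_topics, iterations=100, alpha=0.1, beta=0.01,
               seed=0, top_n=3):
    """Toy collapsed Gibbs sampler for LDA; returns top words per topic."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # per-document topic counts
    topic_word = [Counter() for _ in range(num_topics)]   # per-topic word counts
    topic_total = [0] * num_topics                        # words assigned to each topic
    assignments = []

    # Start from a random topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for word in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word] += 1
            topic_total[z] += 1
        assignments.append(zs)

    # Repeatedly resample each assignment given all the others.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][word] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][word] += 1
                topic_total[z] += 1

    return [
        [word for word, count in topic_word[k].most_common(top_n) if count > 0]
        for k in range(num_topics)
    ]


# Invented corpus: two documents about ships, two about farming.
docs = [
    "ship sea sail captain sea".split(),
    "sea storm ship harbor".split(),
    "wheat farm harvest field wheat".split(),
    "farm field plow harvest".split(),
]

two = fit_topics(docs, num_topics=2)   # the analyst asks for two topics
four = fit_topics(docs, num_topics=4)  # same texts, a different judgment call
print(two)
print(four)
```

The same texts yield a different set of "topics" depending on the number requested, and changing `seed` or the documents in `docs` can shift them again. None of that matters for discovery, but each of those choices is something you would have to defend if the topics were offered as evidence.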

It’s the Scientists that Realized they were Humanists
At the heart of this remain some issues around what it means to do the humanities or to do science. (Fred and I got into this a bit in our Hermeneutics of Data essay). I still hear this persistent fear of people using computational analysis in the humanities bringing about scientism, or positivism. The specter of Cliometrics haunts us. This is completely backwards.

Scientists, at least the sharp ones, have given up on their holy grail. They have given up on the null hypothesis. The sophisticated ones have realized that what they do is really just argument and evidence too. When it comes to justification time, you need to carefully build an argument grounded in evidence and defend it against alternate explanations. If you want a great recent example of this sort of argument and evidence grounded in statistics, I would suggest Nate Silver’s Simple Case for Obama as the Favorite or, if you want a natural science example, this paper on arctic sea ice. Both are great examples of defending against different interpretations of evidence.

What you can get away with depends on what you are doing

When we separate out the context of discovery and exploration from the context of justification we end up clarifying the terms of our conversation. There is a huge difference between “here is an interesting way of thinking about this” and “This evidence supports this claim.” Both scientists and humanists make both of these kinds of assertions. In general, I think the fear of the humanities becoming scientific is largely based on an outmoded idea on the part of humanists as to what we have come to understand happens in science. At the end of the day, both are about generating new ideas and then exploring evidence to see to what extent we can justify our interpretations over a range of other potential interpretations.

Posted in Digital Tools, History | 21 Comments

Apparently When Girls Adopt Technology it Ceases to be Technology

I was excited to read Geek Masculinity and the Myth of the Fake Geek Girl. I saw the image macro at the top, and thought, “neat, another image macro like successful black man that turns stereotypes on their head.” Sadly, this is not the origin of what I came to find is called “Idiot Nerd Girl.”

Reviewing the Idiot Nerd Girl images is a little bit painful. Just another reminder of how far we all have to go. As I’ve suggested before, I think everybody gets to choose if they are a misogynist or a feminist and clearly these are produced by misogynists.

Setting that aside, I saw this one and just felt compelled to dig into one particular genre of these images. The ones that define what gamers are and are not.

For reference, The Sims is the most successful video game. Ever. Of all time. Do you know why it is successful? There are several reasons. First, it’s amazing. The Sims is, by almost all accounts, an innovative and engaging game. Will Wright has described it as his greatest achievement. The Sims also succeeded where so many games have failed. There are a lot of women who like to play The Sims. Now, there are women who like any and all games. However, in the case of The Sims, there were a lot of women who liked to play it. Importantly, it’s not that men don’t like playing The Sims. There are a lot of men playing The Sims. So it isn’t a game for girls or a game for women; it is more accurately a game that is largely gender neutral in terms of audience.

So a girl likes to play The Sims. This apparently means it’s no longer a game, and she isn’t a gamer. Why? I bet there are a million reasons (it’s not hard enough, or it’s not competitive, etc.) and I know all of them are trash. The Sims doesn’t count as a game (the logic that makes this image work) because a lot of women like it. That’s it. When girls take to technology, in many men’s eyes that technology simply ceases to be technology. That’s the case now at least. It wasn’t always that way.

Science is for Girls and Classics is for Boys

Wait, I got that backwards. Right? We need to get more women involved in science! Yes, we do. But there was a time when this was all reversed. The same arguments folks use to support the idea that girls can’t do science were previously used to argue that they couldn’t cut it in classics.

In The Science Education of American Girls: A Historical Perspective, historian of education Kim Tolley documents “the structural and cultural obstacles that emerged to transform what, in the early nineteenth century, was regarded as a ‘girl’s subject’ into something that became defined as innately masculine.” It is a great book. I highly recommend it. The essential point here is that all the reasons for why something is for girls and something else is for boys are basically meaningless. Science was for girls until it had social capital. At that point, science had always inherently been something that boys were good at. (I’m being a little hyperbolic, but I think the point generally stands).

Idiot Nerd Girl is an Ideology

She is just the most recent entry in a long history. Hegemonic masculinity defines computing, defines science, defines whatever, as the things that women aren’t interested in doing. When women become interested in something, that thing either no longer counts (in a situation like The Sims) or the girls are just “pretending” and don’t actually get it. Blergh…

This kind of thing is often just below the surface. It is just so striking that each Idiot Nerd Girl image is such a clear textbook case of the contradictions on display. The meme makers are so unaware that they wear their contradictions on their sleeves.


how it works

Thankfully, at this point, it looks like Idiot Nerd Girl is being widely reclaimed.

Posted in feminism | 1 Comment