The new ITHAKA report, Supporting the Changing Research Practices of Historians, is something that everybody working with cultural heritage collections should read. It’s full of good stuff, but in my opinion the key finding is that Google is now (by and large) the first step in historical research. Fred Gibbs and I reported nearly the same finding in our recent paper on digital tools for historians. The Google search box is the first place historians go when they start their research, and it plays a key role in their discovery process. This is particularly true for idiosyncratic terms, phrases, and people’s names, which often turn up results from Google Books. So, the next time someone tells you that they want to make a “gateway,” a “portal,” or a “registry” for some set of historical materials, you can probably stop reading. It already exists, and it’s Google.
The report makes some suggestions for what libraries and archives should do to make their materials more accessible: namely, that they work to integrate them with discovery tools and do what they can to put more finding aids online. Both of these are valuable, but I think both goals fail to fully take on board the finding about Google and Google Books. If a library, archive, or museum wants its resources to be found as part of the discovery process, the initial phase of theory development, it needs to be thinking about how to get its materials (or information about its materials) to show up in Google search results.
Are more and bigger online finding aids really an answer?
The report suggests that we cultural heritage organizations should be getting more finding aids up. That’s great, and it would be useful. However, given the finding about Google, I think an even bigger potential lesson here is that if you want your collections to be used by researchers (digital or otherwise), the first thing you need to think about is not finding aids but web pages about items, boxes, collections, etc. that will be discoverable in Google. In short, I would rather see a well-structured web page with a well-chosen title and a persistent URL before one even begins to make a finding aid. This is not about SEO; it’s about doing very simple things that make for better HTML pages. Importantly, if an organization makes a single PDF out of a finding aid for a collection and puts it on the web, that finding aid is almost useless as far as Google is concerned.
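As a rough illustration of how little is involved, here is a minimal Python sketch of such a page. It is not taken from the report or from any particular archive's system. The function name, the field names, and the example.org URL are all hypothetical. The only real ideas in it are a descriptive title, a short description, and a canonical link pointing at the persistent URL.

```python
from html import escape


def render_item_page(title, collection, description, persistent_url):
    """Return a minimal HTML page for one item, box, or collection.

    All parameter names are hypothetical; adapt them to whatever
    descriptive fields your institution actually records.
    """
    # A specific, human-readable title is what appears in search results.
    page_title = f"{title} | {collection}"
    return "\n".join([
        "<!DOCTYPE html>",
        '<html lang="en">',
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(page_title)}</title>",
        f'<meta name="description" content="{escape(description)}">',
        # The canonical link tells crawlers which URL is the stable one.
        f'<link rel="canonical" href="{escape(persistent_url)}">',
        "</head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
        f"<p>Part of: {escape(collection)}</p>",
        f"<p>{escape(description)}</p>",
        "</body>",
        "</html>",
    ])


if __name__ == "__main__":
    print(render_item_page(
        "Letters, 1861-1865",
        "Smith & Jones Family Papers",
        "Correspondence between family members during the war years.",
        "https://example.org/collections/smith-jones/box-1",  # made-up URL
    ))
```

The point of the sketch is that the page describes one thing, names it plainly in the title and heading, and lives at one address that will not change.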
What would finding aids look like if they assumed the existence of the web and web search?
To me this raises a rather controversial question. If the goal of the finding aid is to help researchers find things, and the way they do that is to search Google (which is really good at looking for particular things in HTML pages), then why is the HTML page a byproduct of the EAD XML finding aid and not the primary thing that the archivist authors? We designed an infrastructure around EAD and found ways to turn that into HTML pages, but in the meantime Google came around, and historians found that Google was so much more useful and powerful a way to search that they only consult the finding aids to round out the ideas they have already started developing. So, what would minimal archival processing for access look like if we thought first about creating an HTML web page for every collection or every box?
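To make the "page for every box" idea concrete, here is a small Python sketch that splits a finding aid into one HTML page per box. It assumes a heavily simplified, un-namespaced EAD-like fragment using the unittitle, container, and c01 elements. Real EAD files are considerably richer, and the sample data, function name, and file-naming scheme are invented for illustration only.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample data: a tiny, simplified EAD-like fragment.
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did><unittitle>Example Family Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did>
          <container type="box">1</container>
          <unittitle>Correspondence, 1861-1865</unittitle>
        </did>
      </c01>
      <c01 level="series">
        <did>
          <container type="box">2</container>
          <unittitle>Diaries &amp; notebooks</unittitle>
        </did>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def pages_per_box(ead_xml):
    """Map a hypothetical filename to a small HTML page, one per box.

    Simplification: each c01 component is assumed to describe one box.
    """
    root = ET.fromstring(ead_xml)
    collection = root.findtext("archdesc/did/unittitle", default="Untitled collection")
    pages = {}
    for component in root.iterfind("archdesc/dsc/c01"):
        box = component.findtext("did/container", default="").strip()
        title = component.findtext("did/unittitle", default="").strip()
        if not box:
            continue  # skip components with no box number
        heading = f"Box {box}: {title}"
        pages[f"box-{box}.html"] = "\n".join([
            "<!DOCTYPE html>",
            '<html lang="en">',
            f"<head><title>{escape(heading)} | {escape(collection)}</title></head>",
            "<body>",
            f"<h1>{escape(heading)}</h1>",
            f"<p>Part of: {escape(collection)}</p>",
            "</body>",
            "</html>",
        ])
    return pages


if __name__ == "__main__":
    for filename, html_page in pages_per_box(SAMPLE_EAD).items():
        print(filename)
        print(html_page)
```

This still starts from the XML, so it only illustrates the output side of the question. The more radical version would have the archivist author those per-box pages directly.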