New Omeka Zotero Plugin, or “peanut butter in my chocolate”

You know those Reese’s commercials where two people crash into each other on a street corner, one eating a chocolate bar and the other gulping down handfuls of peanut butter right out of the jar? They collide, mix the peanut butter and chocolate together, and then realize how fantastic the combination is. Well, the open source scholarly software equivalent of that happened today, thanks to Jim Safley and the launch of the new Zotero Import Plugin for Omeka. He did a great job of explaining it on the Omeka blog, but I wanted to take a few moments to explain why getting some Omeka on your Zotero and some Zotero in your Omeka is such a neat thing.

Zotero Just Became a Publishing Platform
There are a lot of scholars with tons of interesting materials inside their Zotero libraries. For example, I have 120 TIFFs of postcards from my book on Fairfax County inside my Zotero library. Zotero’s website has become a great platform for sharing and collaborating with folks to build out those collections, but it’s not really a platform for publishing them. Further, it is definitely not a platform for showcasing the often fascinating image, audio and video files associated with those items. By installing this plugin on an Omeka site and pointing it at the collection you want to publish, you can quickly migrate the content. You can then play with and customize an Omeka theme and push out a great-looking, extensible online exhibit.

Omeka Just Got A Tool For Restricting And Structuring Data Entry
On the other side, folks interested in building an Omeka archive just got a very potent way to manage building their collections. One of Omeka’s strengths is its highly flexible data model. Its ability to let you create item types and manage data schemas is fantastic. With that said, there are times when you actually don’t want all of that flexibility. It can be a bit overwhelming, particularly when you have a large group of people trying to do data entry and add files. Now, if Zotero’s default item types work for your archive, you can simply have anyone who is going to add to the archive install Zotero and join your group. In this capacity, Zotero becomes a drag-and-drop UI for adding items and files to an Omeka exhibit. Once everything is in, you can simply import all the info into your Omeka exhibit.

Mining Old News For Fresh Historical Insight

This week I had the honor of participating in the Library of Congress’ national strategy for digital news summit. The Library gathered together a diverse mix of corporate and public archivists, representatives from public and private foundations, and librarians to discuss the digital future of news. The conversations focused on both how to preserve born digital news and how to archive old news migrating into digital forms. I was honored to have a chance to bring in my perspective as a consumer of that archived news.

I gave a short presentation about some of the ways digitized historical news enables historians to ask different kinds of questions. I think the talk has some implications for both historians and digital archivists, so I thought I would share the gist of the talk here to continue the conversation we started at the meeting.

In my mind this contributes to ongoing discussions about the role that digital tools should play in re-framing conversations about historical methodology. Since the structure of the archive plays a significant role in the structure and character of the kinds of questions a historian can ask, it’s crucial for historians to be involved in helping shape these archives.

A Use Case for Historical News: Marie Curie Visits America

On May 11, 1921, the world’s most famous female scientist disembarked from a long Atlantic voyage in New York City. For the ten weeks Marie Curie toured the United States, she was greeted as an international celebrity; according to the New York Times, she was the “biggest hit of any celebrity who has come to New York” for quite some time. Curie was greeted with speeches and fanfare in New York, Washington DC, Pittsburgh and Chicago, gracing major newspapers several times a week. Less than a year after American women won the right to vote through the 19th Amendment, Marie Curie, the only Nobel laureate twice over and the world’s most distinguished woman of science, visited the United States. Last year I decided to explore how different periodicals reported on Curie’s visit. Analysis of coverage of her visit exposes divergent ideas about the place of women in American science, society and work emerging in the early twentieth century. For our purposes, this case also exposes some of the transformational power databases and digital tools present for historical inquiry.

Asking A Database Historical Questions


It took me six seconds to find the 1512 references to Marie Curie in the entire history of the New York Times, the Atlanta Constitution, the LA Times, the Boston Globe, the Washington Post, the Chicago Tribune and the Wall Street Journal. Now this obviously saved me a ton of time, but the implications of this search are much deeper than this. Reading the entire history of these publications for mentions of Curie would not only be impractical, it would be impossible.

If I had wanted to explore press coverage of Curie in the pre-full-text search world, I would have selected a few key dates when I would expect her to have been mentioned, gone to the library, and rolled out the microfilm. I would have found many of these articles, but the time it takes to find them requires a larger upfront commitment to exactly what I intend to explore, and how I want to explore it. With search I have the ability to quickly get a feel for different questions in different queries while simultaneously uncovering mentions of Curie on editorial pages and in periods I would not have expected to find her mentioned.

Personal Archive Tools Exponentially Increase This Transformative Power

Repositories like ProQuest Historical Newspapers are powerful, and their ability to let users explore connections between items inside their collections shapes the kinds of questions historians can ask about their contents, but that is just the surface of the potential these databases afford. With a tool like Zotero it is possible to aggregate materials from a variety of different sources and mine them in sophisticated ways for historical insights.

After I gathered the relevant items and fulltext PDFs from ProQuest, I ran a similar search through Reader’s Guide Retrospective. While Reader’s Guide Retrospective did not offer seamless integration with Zotero, I was able to pull out structured data for hundreds of references, and with a few clicks had submitted inter-library loan requests for fulltext scans of the most relevant articles. When I received those PDFs I was able to simply drag and drop them into Zotero to store alongside the data. As I constructed my personal archive I was then able to turn Zotero’s search capabilities on the collection to explore interesting relationships between my data.

Zotero Library

Data fields carry unexpected potential

I created a variety of saved searches from criteria in my research data. Page numbers are included in this data for a specific reason: they are crucial for citation. Beyond that purpose, page numbers also represent an important statement about the objects in my collection. While all of the articles I discovered about Curie are relevant to my analysis, articles on the front page of a newspaper are particularly relevant to questions about how Curie was presented to the public. This field in my database, the page on which each article can be found, was included to help people find the articles in citations, but it, like many other fields in my database, also communicates a historical significance.
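A saved search like this boils down to a simple filter on the page field. Here is a minimal sketch of the idea in Python. The field names (`title`, `pages`) and the sample records are placeholders made up for illustration; they are not taken from my library and are not Zotero's own export format.

```python
def starts_on_front_page(pages: str) -> bool:
    """Return True when a page field such as "1", "1-3" or "1, 4" begins on page 1."""
    # Treat ranges ("1-3") and lists ("1, 4") alike: only the first page matters.
    first = pages.replace("-", ",").split(",")[0].strip()
    return first == "1"


# Placeholder records for illustration only.
items = [
    {"title": "Example article A", "pages": "1"},
    {"title": "Example article B", "pages": "14"},
    {"title": "Example article C", "pages": "1, 3"},
    {"title": "Example article D", "pages": ""},
]

# Keep only the items whose page field starts on page 1.
front_page = [item for item in items if starts_on_front_page(item["pages"])]
```

Page fields that carry a section letter (something like "A1") would need extra handling that this sketch does not attempt.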


Facets of that significance can expose historical insight

Once I had isolated the front-page stories about Curie, I had the opportunity to further explore this subset of thirty or so articles. Zotero’s ability to display the collection in a timeline allowed me to quickly visualize the chronology of Curie’s appearances on the front page. From there I could use the “highlight” function to further explore the data. Based on my experience with discussions of Curie’s visit to America, I decided to highlight mentions of cancer in titles. Finding the word in a plurality of the front-page stories leads to a particular historical insight.

Marie Curie’s contributions to science are impressive, but the connection between her work and a cure or treatment for cancer is tenuous. While the word cancer does not appear, in any significant fashion, across all the hundreds of article titles about her visit, it does show up in a significantly larger portion of the front-page story titles. This provides tentative support for the notion that Curie’s work, and importance, was misrepresented in feminine terms, framing her in the feminine role of healer instead of the masculine role of scientist.
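The comparison here is just two proportions: the share of all titles that mention a word against the share of front-page titles that do. A minimal sketch in Python follows. The titles are invented placeholders rather than real headlines, and `share_with_keyword` is a helper name of my own, not something Zotero provides.

```python
def share_with_keyword(titles, keyword):
    """Return the fraction of titles that contain the keyword, ignoring case."""
    if not titles:
        return 0.0
    needle = keyword.lower()
    hits = sum(1 for title in titles if needle in title.lower())
    return hits / len(titles)


# Placeholder titles for illustration only.
all_titles = [
    "Placeholder headline about a cancer cure",
    "Placeholder headline about radium",
    "Placeholder headline about a reception",
    "Placeholder headline about a Cancer gift",
]
# Pretend the first and last of these ran on the front page.
front_page_titles = [all_titles[0], all_titles[3]]

overall_share = share_with_keyword(all_titles, "cancer")
front_page_share = share_with_keyword(front_page_titles, "cancer")
```

With a subset of only thirty or so articles, a gap between the two shares is suggestive at best, which is why I call the support tentative.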


Implications for history and digital archives

Implications for historical methods: While it is indeed possible to count these things out without these sorts of tools, the ease with which I was able to mine a large set of documents for relevant information, and historical insight, has important ramifications. As far as I am concerned, the only way that historians can overcome the issues that arise from the problem of abundance of historical materials is to begin using tools for data analysis that allow for “distant readings” of texts. This can only be accomplished if some larger issues are observed in the creation and digitization of historical records and texts.

Implications for historical archives and databases: Exposing fulltext and coherent metadata is essential; building fancy repository-specific visualizations and manipulations is extravagant. What is going to matter to historians of the future is the ability to take your data, dump it into a tool like Zotero, and use any number of analytical tools to explore that data in relation to information from other repositories. In that light, any fancy encoding and detailed levels of information you work into your resources are of limited use if they are not carried across into other spaces. We are not going to solve the problem of abundance by digging deeply into small sets of documents encoded in TEI; we’re going to get there with the metadata we have, dirty OCR, and the emerging universe of entity extraction.

Distributed Research Tool Instruction: Think Interlibrary Loan for Training

The ever-growing heap of neat digital research tools is simultaneously fascinating and problematic. Some of this stuff really has the potential to be transformational, to provide new avenues for scholarship and teaching, but the sheer quantity of tools makes it a bit difficult for scholars and teachers to know where to start, and what to do once they have started. I am excited to see some of these research tools, like Zotero, becoming part of library instruction on various campuses, but the number of tools is growing faster than the few instruction folks at any institution can fold them into their offerings. While there are many other avenues for learning about these tools (documentation, screencasts, etc.), there is a lot to be said for the sort of hands-on instruction and thoughtfulness you get from instruction folks.

With just a little creative thinking I think we could work this out. By pooling instructional resources in much the same way that libraries pool their collections, we could develop a rich, collective, distributed instruction network that could function alongside existing instruction networks. If folks are interested, please leave comments. It’s also entirely possible that this sort of thing already exists; if so, please take a moment to point me to it. Here are what I see as the potential advantages.

More Flexible Scheduling:

By pooling resources, folks at libraries and other parts of schools involved in instruction can offer users a much more flexible schedule of instruction. If 15 campuses each offer 5 sessions on Zotero in this sort of pool, students and faculty at each of their institutions now have access to 75 different sessions on Zotero instead of 5.

Share Exotic and Esoteric Research Tools:

Every instructional tech person I’ve met has a specialty. If there were this sort of distributed instruction network, a librarian in Kansas with an amazing way to use del.icio.us for immunology research, who might not be able to fill a class on his campus, could probably fill out the session with folks from a larger pool of students and researchers.

Connect Existing Instruction Networks:

Even at individual campuses instruction on tools tends to crop up in all sorts of unexpected places. For example, at GMU the Center for Teaching Excellence, Writing Center, Campus Libraries, Instructional Technology Services alongside individual departments all offer different sorts of training. Beyond these differences GMU is spread across three different campuses, meaning that face to face classes in each of these cases are distributed across each campus.

So what would this Distributed Digital Tool Instruction Thingy Look Like?

I don’t have a clear vision here. I think there are several different directions in which something like this could develop. Here are three options as I see them.

Piggy Back on An Existing Service: There are now a multitude of free enough platforms for screensharing, live chat, sharing slides, and video conferencing. A system for this could simply piggy back on a service like WiZiQ or DimDim. This scenario would have zero upfront investment, and folks could just start this network inside one of these tools.

Stitch together a much more flexible network: Another approach would be to stitch together a small, tool-agnostic setup. Everyone uses the system they are comfortable with, aggregates info on what sorts of instruction are going on, and posts what they are teaching on a collaborative calendar.

Build Something More Coherent: Work up a more coherent custom platform for pulling all of this together. There are a lot of neat, more complicated, possibilities. For example a system could keep track of karma points for users from an institution and classes offered by folks from that institution.

Recap from first Triannual Zotero Trainers Workshop

Last week I had the pleasure of running the first of Zotero’s triannual (that’s three times a year) workshops for Zotero trainers (looking for a better name for “trainer”). I had a great time, and I think everyone left with a nice balance of practical next steps for making Zotero work at their own institutions and rabid enthusiasm for the exciting collaborative features just around the corner. I also left with a slate of new ideas for resources I can develop to help them better make the case for Zotero at their institutions. If you’re interested in joining in on those ongoing conversations, join our Google group. I am currently hammering out the details for the second workshop, which will most likely take place at Emory in Atlanta this July. Stay tuned for more details. Below are some pictures from the workshop.

We started with a somewhat exhaustive run-through of Zotero’s current feature set.

We then spent some time poking around under Zotero’s hood, getting a feel for where and how Zotero stores data and attached files, how Zotero’s site translators work, and (pictured above) making minor edits to some of the CSL files Zotero uses to create bibliographies.

On day two we spent a bit of time analyzing a few different libraries’ approaches to developing their own Zotero documentation for their users, and hashed out some best practices for connecting efforts to support Zotero at individual institutions with the existing Zotero support networks.

Export from Zotero to LibraryThing or Goodreads

One of Zotero’s many virtues is that it is a really robust container for bibliographic data. If you want to spend a little time playing with the Citation Style Language that Zotero uses, it is actually pretty easy to get some useful data out of Zotero to do all sorts of fun things with. One of the simplest is exporting ISBN numbers for services like LibraryThing and Goodreads; each service then uses the ISBN to grab the data from Amazon, the Library of Congress, or just the existing pool of items that it already has available.

Gist:

  1. Install this CSL into your copy of Zotero.
  2. Create a bibliography from all your books using the ISBN Export style
  3. Import the list using the LibraryThing or Goodreads importer

Organize Books Inside Zotero

Before explaining how to export the books, you’ll want to get a clean list of books you own. I tag all the items in my library that I own with the “I Own it” tag. From there it is easy to create an advanced search for all your books that have that tag.

Getting ISBN Numbers

Next use my nifty CSL file to export ISBN numbers. Just save this CSL file to your desktop and drag it into an open Firefox window; you should then be prompted to install the CSL. Once installed, you will have ISBN Export as an option in the create bibliography menu.

This very simple export style underscores how easy it is to get started playing with CSL. The part of the style that does all the work is really just these few lines.

  <bibliography>
    <sort>
      <key variable="ISBN"/>
    </sort>
    <layout suffix="">
      <text variable="ISBN" prefix="" suffix="     "/>
    </layout>
  </bibliography>

The first part, <sort>, sets the list to sort by ISBN number, and the second part, <layout>, tells Zotero that all we want is the ISBN, with no prefix and only a few spaces as a suffix to separate one number from the next.
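If you are more at home in a scripting language than in CSL, here is roughly what that snippet asks for, sketched in Python. This is not how Zotero processes styles. The `isbn_list` function and the sample records are made up for illustration, the five-space separator simply mirrors the `suffix` in the style, and skipping items that have no ISBN is my own simplification.

```python
def isbn_list(items, separator=" " * 5):
    """Sort items by ISBN and emit each ISBN followed by the separator."""
    # Items without an ISBN are skipped (an assumption made for this sketch).
    isbns = sorted(item["ISBN"] for item in items if item.get("ISBN"))
    return "".join(isbn + separator for isbn in isbns)


# Placeholder records; the ISBNs are dummy values, not real books.
books = [
    {"title": "Example B", "ISBN": "9780000000002"},
    {"title": "Example A", "ISBN": "9780000000001"},
    {"title": "No ISBN on record"},
]

output = isbn_list(books)
```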

Uploading Your File

From there all you need to do is upload your file. Both Goodreads and LibraryThing have pages for uploading book information. While each service allows you to upload additional information, my understanding is that the other info is only used when the ISBN number for a given work is missing or malformed.

Making Book Labels With Zotero

To the left you can see a sample of some of my labeled books. They may not be particularly pretty, but those labels do exactly what I want them to do: display information, have only a limited chance of damaging my books, and cost me practically nothing. In this post I will walk through how I took my catalog of books from inside my Zotero collection, generated the labels, and went about sticking them on. This has been a bit more time intensive than I initially thought. It is really easy to export the tab-delimited file and make labels out of it; that only takes a few moments. The time-consuming part was matching up the labels with my books. After the first batch I came up with a few ways to help speed up matching the books to their labels. So, if you’re following along at home, this should work a bit quicker than it initially did for me.

1. Install my ugly hack of a tab-delimited citation style.

I tweaked an existing style into this tab-delimited export. To install it, just download it to your desktop and drag it into an open Firefox browser window. You should be prompted to install it.

2. Export your data using the tab-delimited style.

This part is easy: just right click on the items you want to export and choose the style you just installed. At this moment you have an opportunity to make life easier for yourself. Export smaller batches of items using tags you have assigned based on where the books are located in your house. It will only take a few seconds to tag all the books on the shelf in the guest bedroom with “location:guestbedroom”. Export the guest bedroom books in one batch. Then run through the rest of the steps. When you print out the labels you can just go straight to the guest bedroom instead of wandering aimlessly throughout your whole house trying to remember which shelf you stuck Harry Potter and the Half-Blood Prince on.

3. Check the exported file.

Remember, my export style is not fancy; in fact, I called it ugly. Open up the file, consider sorting it by call number, and double check that your data is there. I imported the file into Excel to make sure it looked OK, but you could use any kind of spreadsheet or database application to do this.

4. Doing the Mail Merge.

The next step is to merge the exported tab-delimited file into a print-ready document. I used Word’s mail merge function and its standard address labels. It works a little differently in different versions of Word, but the general concept is the same. You open the data manager, or whatever they like to call it, import data from your tab-delimited file, and then just drag the data hunks into the labels with the order and spacing you want. Then you merge the data and the structure and send it to either a new document or a printer.

5. Print ’em and stick ’em.

Printing is easy; sticking them to your books is time consuming. If you like, you can pick up pre-sticky address label paper at Target or an office supply store. I find these little guys to be more trouble than they’re worth, though. In my experience some of the records inevitably print out of alignment with the diecuts, the ink smears when you touch it, they cause printer jams, and when you eventually try to peel them off they leave nasty gunk behind. I chose to just print mine on regular paper, cut them apart with a paper cutter and adhere them to my books with scotch tape. If you broke your books into location-based batches it should not take too long to stick on the labels. Once you have all the books labeled, it is as easy as making sure the books are in the right order.
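If you would rather script the checking and merging steps than click through a spreadsheet and a mail merge, here is a minimal sketch in Python. It assumes each exported line holds a title, a tab, and a call number. That column order, the function names, and the sample rows are all assumptions made for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io


def read_export(text):
    """Parse tab-delimited text into (title, call_number) pairs, skipping blank lines."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or not any(cell.strip() for cell in row):
            continue
        title = row[0].strip()
        call_number = row[1].strip() if len(row) > 1 else ""
        rows.append((title, call_number))
    return rows


def missing_call_numbers(rows):
    """List the titles that came through without a call number."""
    return [title for title, call_number in rows if not call_number]


def make_labels(rows):
    """Sort by call number and format each record as a two-line label."""
    # A plain string sort is only an approximation of true LC shelf order.
    ordered = sorted((r for r in rows if r[1]), key=lambda r: r[1])
    return [f"{call_number}\n{title}" for title, call_number in ordered]


# Placeholder data for illustration only.
sample = "Example Title B\tQC16 .E9\nExample Title A\tPN3433 .E9\nExample Title C\t\n"
rows = read_export(sample)
labels = make_labels(rows)
```

Running `missing_call_numbers(rows)` before printing is the scripted version of double checking that your data is there. Running the whole thing once per location batch keeps each stack of labels tied to one room.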

Cataloging Our Library in Zotero

Before starting my home cataloging project I really only used Zotero to grab individual items; this was my first time trying to look up large numbers of items at once. I am happy to report that I came up with a workflow that let me run through about fifty books every ten minutes.

I found that the quickest way to pick up my books was to throw a few keywords from each book title, and if necessary an author’s last name, into the Library of Congress’s basic keyword search. Most of these searches will then jump directly to the individual item record pages, where you can grab the bibliographic record, including all the subject headings, call number, and other relevant info, with a single click.

To make sure the data looked good, I kept Zotero part way open and switched out the fields shown in the middle column to show only the call number and the title, and to sort by the date added. With that configuration each new item I add ends up at the bottom of the list and each of the relevant pieces of information is right there for me to check. You can see what that looks like below.

So far I have pulled about 350 of our books into my Zotero library; once you get the workflow going it moves pretty quickly. In short order I should have a digital copy of our entire library. In the next post I will explain how to get these items out of my collection as print-ready labels.

Using Zotero as a Personal Library Catalog

Note: Not actually my books; image credit to Kristin Brenemen.

My wife and I have a lot of books, tons of books. So many books that I am sometimes surprised to find books I didn’t even know we had. Over the years I have tried to organize them in ways that make sense to me. This approach has failed utterly and completely. I have now resolved to organize our library using the Library of Congress Classification system and I think I have the technology to make this relatively easy.

Below are the steps I see to making this work. I have done some experiments and I am pretty confident that this will work. I intend to make detailed posts about each stage. So if anyone out there is as big a book dork as I am, I should leave detailed enough instructions for you to follow along.

How am I going to do this?

First I need to capture the bibliographic information for our books from the Library of Congress catalog into my Zotero collection. Since I already have about 100 of our books in my collection this should be relatively easy.

Then I need to export the names and call numbers of all those books from Zotero. I should be able to do this by hacking together a simple custom bibliographic style. With any luck it won’t take long.

After that I will take that tab-delimited file and use a mail merge to print the titles and call numbers of my books onto address labels that I picked up at Target.

Why would I want to do this?

You might be thinking, why are you using Zotero for this? If all I wanted to do was organize my books by their LC call numbers, I could just look them up and paste the call numbers into a Word document.

While this would be a way to go, doing this inside Zotero gives me the added benefit of having a really amazing iTunes-like interface to find my books. I am also excited to see what the weighted tag cloud of all the subject headings for my books looks like.

One might also ask, why not use something like Delicious Library or LibraryThing? First, I’m cheap. Both of these cost money, whereas Zotero is free. Furthermore, the various programs created to organize one’s local collection of books are set up to do just that and only that. There are some big plans for Zotero to do a whole lot more, and I think it will be neat to have my entire collection of books in my Zotero library as those new features roll out.

Edit: I meant to say LC classification system, not LCSH.

Children's Books By The Numbers: Or Two Things I Learned From Franco Moretti

A few weeks ago I had the pleasure of reading Franco Moretti’s Graphs, Maps, Trees. If you haven’t read it, I highly recommend it as a truly compelling exploration of what individuals interested in the history of literature can glean by counting. After a bit of thought I am confident that some of his approaches will be quite useful in framing our understanding of children’s nonfiction.

As previously mentioned, my project began in consideration of an anomaly of numbers. There are more children’s books about Marie Curie than about any other scientist. As a start to quantifying the history of science literature for children, I thought it would be worth sorting out a bit more of who the popular stars are in comparison to the major players in biographies of scientists written for a more mature audience.

For a rough start I did some quick searches on WorldCat for juvenile and non-juvenile biographies about a laundry list of popular scientists and inventors and dumped the data at Swivel.

Number of Children's Books About Different Scientists and Inventors

It appears that the same trend for gender in science is mirrored in race in invention. Curie is the most written about scientist for children, and George Washington Carver is the most written about inventor. But when we take the list of books for an older audience, they fall far out of their top positions. What are we to do with this? The second thing I took away from Moretti is his insistence that we should be actively looking for questions we have no answer for. While this is essentially the same question I started my undergraduate thesis with, I don’t really feel I am any more qualified to answer it.

Number of Biographies of Scientists and Inventors Written For An Adult Audience

I have a few ideas but I need to spend a bit more time fleshing them out. Stay tuned for more. In the meantime, what do you think could explain this phenomenon? In the next few weeks I will post some of my thoughts on this and hopefully pull together some more robust numbers about these books. I am working on a way to export a CSV file from my Zotero collection that should help me isolate when Curie and Carver became the most written about scientist and inventor for kids.
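To give a sense of what I have in mind once that CSV exists, here is a minimal sketch in Python that counts books per decade for one subject. The column names (`subject`, `year`) and the sample rows are hypothetical placeholders, not real numbers from WorldCat or from my collection.

```python
import csv
import io
from collections import Counter


def books_per_decade(csv_text, subject):
    """Count rows for one subject, grouped by decade of publication."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["subject"] != subject:
            continue
        year = (row["year"] or "").strip()
        if not year.isdigit():
            continue  # skip rows with a missing or malformed year
        counts[int(year) // 10 * 10] += 1
    return dict(sorted(counts.items()))


# Placeholder rows for illustration only.
sample = """subject,year
Subject A,1938
Subject A,1941
Subject A,1947
Subject B,1945
Subject A,
"""

per_decade = books_per_decade(sample, "Subject A")
```

Running the same count for each scientist and inventor, and comparing the decades side by side, is one way to see when a given figure pulls ahead.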

But in the meantime, why is there such a large market for children’s books about Carver and Curie, and why does that market dry up when those children grow up?