Sunday, August 07, 2011

Can You Spare Four Hours?

Do you have a few hours to spare for a good literary project?  If you can spare an estimated four hours, you can help out by completing the "online text editing" of one of Charles Dickens's old magazines from the 1850s and 1860s.

The call for volunteers comes from Dickens Journals Online, and all the details, along with how to join the project, can be found there.  This bit is taken directly from the website and explains the need for volunteers and what all is involved.  Take a look and, if you are so inclined after reading this summary, head over to the website to join the project.  Honestly, this sounds more like fun than work, and I'm willing to bet that book bloggers all over the world are ready to jump in and help.
The OTC project began in earnest in January 2011, and for the first time we have given limited public access to DJO. We need to correct about 30,000 journal pages, not including the Household Narrative. This is a bit too much for our small team of three, and we desperately need volunteers! We'd love to complete this project before the official launch of the DJO site in 2012 — and can do so with your help.


A quick technical review will help the reader understand the need for text correction: we store two files (or records) for every journal page, one file being a facsimile of the original page stored as an image file (in ".jpg" format), and the other being a text file, similar to this web page. The text file was produced by applying optical character recognition (OCR) software to the image file, where the accuracy of each process depends on the quality of the image file.


Although the image files were created using a state-of-the-art scanning device, the quality of the original journal pages varied and some contained paper folds, smudge marks, transparency, etc., and as a result the text files contain a number of errors that vary from file to file. This is the main problem that we are trying to correct. A secondary problem, relatively trivial, is that the text file contains unwanted information and styling, which can also be corrected at the same time as the actual mistakes.


We have decided to make a magazine, typically 24 pages long, the smallest unit of contribution and as a result we will have 1,101 units of work at the end of the day. So if we find around 1,000 volunteers to take on 1 or 2 magazines each, we will reach the target between us. We reckon that with a typical magazine, it will take about 10 minutes to review and correct each page (24 pages x 10 minutes = 240 minutes, or 4 hours' work). Please pass the details to friends you think might be interested.
If any of you do sign up, please let me know how it goes for you. I'm going to register this morning myself.

2 comments:

  1. What an interesting little project. I've read a few of the magazines. They are interesting from a historical point of view, but not all that great from a literary point of view.

    I will at least check out the website, myself. I wonder if I have four hours to spare before school starts up again.

  2. I signed up, James, but when I went back to the website today there was a message about some damage they have suffered. Supposedly they will be back online in a day or so.

