Book Scanner 1: Define the mission.

For the last several years I’ve been struggling with finding a method to get printed material into digital form. Of course, like everyone else, I’d much rather read books and magazines in their original paper format. That’s great, until you start spending more than a dozen days a month on the road. When that happens, you start to realize just how heavy books and magazines can be. When I’m on the road, I carry my laptop, and I’ll typically have 5 or 6 magazines tucked into my bag, and maybe a book or two, but in no case anything larger than a trade paperback, or a hardcover if it’s in a relatively compact form factor.

This is fine until I finish the book (and I’m then carrying it around for a while), or realize that I’ve read all the magazines, or inspiration strikes, and I want to read over something that I first saw several months or years ago. This is when things get frustrating.

I also have a significant collection of back issues of various magazines, generally on subjects like model-building and woodworking, where older information is more often than not still just as useful as it was at the time of publication. It would sure be nice to have all that information available in digital form (maybe even OCR’ed), for easy access and ultimately more secure archiving. I would feel a lot more comfortable about having a database that is backed up and mobile, rather than depending on the forbearance of rodents and mildew to spare my printed magazine collection in its basement storage location. I also like my chances of actually finding the thing I’m looking for a lot better if I know it’s somewhere on the hard disk, rather than somewhere in the house. You get the idea: I’m looking for a good scanning solution.

The tried and true flatbed scanner worked okay, as did the all-in-one that I currently have. Trouble is, they are designed to scan single pages, not bound material. These scanners are also slow, and demand both a lot of time and a lot of attention. Either one of those by itself would be fine, but it’s not realistic for me to spend an hour or so scanning each issue, then another hour or so cleaning up the inevitable artifacts of flatbed scanning (mainly deskewing, which means rotating each page image until the lines of text run perfectly level from left to right, with no uphill or downhill drift). I know; I tried.

The goal: a scanning solution that is relatively quick and easy, at least for the data acquisition phase. I can spend a little more time on post-production, as I can do that on the road on the laptop. Of course, getting a good, mostly automated workflow would be even better. Cost is certainly an issue. I’m not interested in a giant machine that will suck up gobs of money, space and electricity. I’d love for this to be a homebrew, DIY project!

So I was surfing around the instructables.com site and discovered this project, and perhaps more importantly, this website. This could well be the answer to my problems! It’s a DIY book scanner! Broken down, the scanner consists of a cradle, into which a book is placed. A glass or acrylic platen is placed over the book, and the combination holds the book open at a 90 degree angle. Two cameras, mounted on either side of the scanner, photograph the two facing pages. What happens with the photos after that is a software question, but it seems most people turn them into .pdf files. Very spiffy, and WAY faster than the all-in-one. So, lots of attention, little time. That fits into my calculation! Users claim they can top 300 pages per hour, which works out to roughly 15 minutes to scan one of my old model railroad magazines. Of course, there is post-processing involved, which will add to the time required, but I’d be doing that with scans from the all-in-one, too.
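For anyone who wants to check my math, here is a minimal back-of-the-envelope sketch in Python. The 300 pages per hour figure is the users' claim quoted above; the 75-page magazine is my own assumption (it is simply what 15 minutes at that rate implies), and the function name is made up for illustration.

```python
def scan_minutes(pages: int, pages_per_hour: float) -> float:
    """Estimate capture time in minutes, ignoring setup and post-processing."""
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    return pages / pages_per_hour * 60


# Assumed figures: a 75-page magazine at the claimed 300 pages per hour.
ASSUMED_PAGES_PER_ISSUE = 75
CLAIMED_PAGES_PER_HOUR = 300

minutes_per_issue = scan_minutes(ASSUMED_PAGES_PER_ISSUE, CLAIMED_PAGES_PER_HOUR)
print(f"About {minutes_per_issue:.0f} minutes per issue")
```

Swap in your own page count and a more pessimistic rate to see how quickly a whole shelf of back issues adds up. Post-processing time is not included in this estimate.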

As something of an aside, I also stumbled across this project, which has some similar goals. Unfortunately, I think this design is a little more oriented toward scanning text-only books, rather than books or magazines that are more image-intensive. I also think that this design lends itself more to low-volume scanning, and would get pretty exhausting after a few hundred pages. So there it is.

In any case, the commercially available solutions for this kind of scanning cost thousands of dollars. After the expense of a couple of low-cost cameras, I’m pretty sure that I already have almost all the materials I’ll need, and for what I don’t have, a quick trip to the Bi-Mart and the Ace Hardware, both within a couple of blocks of my house, should take care of it.

