Readability of Scanned Books in Digital Libraries
Category: Digital Libraries
The wonk factor is only outweighed by the coolness factor here: the Conference on Human Factors in Computing Systems, held in Florence, Italy, has released its proceedings. And what I'm really excited about is the article "The Readability of Scanned Books in Digital Libraries." You can download the whole thing, but here's the abstract:

"Displaying scanned book pages in a web browser is difficult, due to an array of characteristics of the common user's configuration that compound to yield text that is degraded and illegibly small. For books which contain only text, this can often be solved by using OCR or manual transcription to extract and present the text alone, or by magnifying the page and presenting it in a scrolling panel. Books with rich illustrations, especially children's picture books, present a greater challenge because their enjoyment is dependent on reading the text in the context of the full page with its illustrations. We have created two novel prototypes for solving this problem by magnifying just the text, without magnifying the entire page. We present the results of a user study of these techniques. Users found our prototypes to be more effective than the dominant interface type for reading this kind of material and, in some cases, even preferable to the physical book itself."

It's about time someone studied this and measured it!!! I remember last year's BEA where Cliff Guren was showing some differences in competitors' scanning efforts (ahem) and Microsoft's; that was very eye-opening, and made me wonder about kids' books and all the illustrations. Anyway, cool beans!