The Open Library Project


shachi
08-05-2007, 06:24 PM
Hello everyone!
The Open Library Project: http://openlibrary.org/

It seems to have some very neat features, but the one I am most interested in is the text search (click on a book). As you can see, it searches the book and highlights the matching text on the page, which is an image. How is such an effect possible? :confused: Does anyone have any idea how one can search an image for matching text? Just curious. :p

Thanks. :)

Twey
08-05-2007, 07:13 PM
The search doesn't work for me so I can't be entirely sure, but I'd suspect the highlighted page images are generated server-side, and the text of the book is also stored.

shachi
08-06-2007, 04:29 PM
The search doesn't work for me so I can't be entirely sure

Oh.

but I'd suspect the highlighted page images are generated server-side,

True, but how could they do that unless they've stored the text of the book as well?

I'd love to know how it's done without actually storing the text.

Thanks for your reply. :)

Twey
08-06-2007, 05:30 PM
and the text of the book is also stored.

It is possible that it could have been done via a text-recognition algorithm.
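
To make the idea concrete, here is a minimal Python sketch of how a search could map back onto a scanned page, assuming a text-recognition step has already produced a list of words with pixel positions. Nothing here is confirmed to be what Open Library does; `WordBox`, `find_phrase`, and the sample page data are hypothetical names and values made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WordBox:
    """One recognised word and where it sits on the page image (pixels)."""
    text: str
    x: int
    y: int
    width: int
    height: int


def normalize(word: str) -> str:
    """Lower-case a word and drop punctuation so 'library,' matches 'Library'."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def find_phrase(words: List[WordBox], query: str) -> List[List[WordBox]]:
    """Return every run of consecutive words on the page that matches the query."""
    terms = [t for t in (normalize(q) for q in query.split()) if t]
    if not terms:
        return []
    tokens = [normalize(w.text) for w in words]
    hits = []
    for start in range(len(tokens) - len(terms) + 1):
        if tokens[start:start + len(terms)] == terms:
            hits.append(words[start:start + len(terms)])
    return hits


# Hypothetical recognition output for one line of a scanned page.
page = [
    WordBox("The", 40, 100, 30, 14),
    WordBox("open", 75, 100, 38, 14),
    WordBox("library,", 118, 100, 60, 14),
    WordBox("an", 183, 100, 18, 14),
    WordBox("open", 206, 100, 38, 14),
    WordBox("book.", 249, 100, 42, 14),
]

matches = find_phrase(page, "Open Library")
# Rectangles (x, y, width, height) that a viewer could shade on the image.
rectangles = [(w.x, w.y, w.width, w.height) for hit in matches for w in hit]
print(rectangles)
```

The point is that the search never looks at pixels: it runs over the recognised text, and the stored coordinates say where to draw the highlight.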

shachi
08-07-2007, 12:04 PM
Well, that seems pretty reasonable. I wonder how they do it. What I'd love to see is a text-recognition algorithm in PHP. That'd be quite a challenge. :p

djr33
08-07-2007, 12:08 PM
I think it's stored server-side. That image may be rendered, as well. I doubt they are storing full images for every single page.
If you notice, the audio is just a recording of the book, so it's probably the same with text... just interrelated pieces.
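
If the highlighted image really is rendered on the server, the drawing step could be as simple as blending a tint over the matched region. The following is only a sketch under that assumption, written in plain Python with the page held as a nested list of RGB tuples; the `highlight` function, the box format `(x, y, width, height)`, and the tiny sample raster are all hypothetical.

```python
def highlight(pixels, box, tint=(255, 255, 0), alpha=0.4):
    """Return a copy of an RGB raster with the given box blended towards `tint`."""
    x, y, width, height = box
    out = [row[:] for row in pixels]  # copy rows so the original page is untouched
    for row in range(max(y, 0), min(y + height, len(out))):
        for col in range(max(x, 0), min(x + width, len(out[row]))):
            out[row][col] = tuple(
                round((1 - alpha) * channel + alpha * t)
                for channel, t in zip(out[row][col], tint)
            )
    return out


# A tiny 6x4 white "page" with one black pixel standing in for printed text.
page = [[(255, 255, 255)] * 6 for _ in range(4)]
page[1][2] = (0, 0, 0)

highlighted = highlight(page, (1, 1, 3, 2))
print(highlighted[1][1], highlighted[1][2], highlighted[0][0])
```

Because the tint is blended rather than painted solid, the dark text pixels stay darker than the background and remain readable under the highlight.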

shachi
08-08-2007, 07:57 PM
djr33: True, I completely missed that.