The Open Library Project



shachi
08-05-2007, 05:24 PM
Hello everyone!
The Open Library Project (http://openlibrary.org/) seems to have very neat features, but the one I am most interested in is the text search (click on a book). As you can see, it searches the book and highlights the matching text on the page, which is an image. How is such an effect possible? :confused: Does anyone have any idea how one can search an image for matching text? Just curious. :p

Thanks. :)

Twey
08-05-2007, 06:13 PM
The search doesn't work for me so I can't be entirely sure, but I'd suspect the highlighted page images are generated server-side, and the text of the book is also stored.
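
To make that guess concrete, here is a minimal sketch in Python of how stored text could drive the highlighting. It assumes the server keeps, for each page, a list of words with the pixel box each word occupies in the page image. The `Word` record, the `find_highlights` function, and the sample `PAGE` data are all hypothetical; nothing here is confirmed to be how Open Library does it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One word of stored page text plus its box in the page image (assumed layout)."""
    text: str
    x: int
    y: int
    w: int
    h: int


def find_highlights(words, query):
    """Return one (x, y, w, h) box for every occurrence of the query phrase."""
    terms = query.lower().split()
    if not terms:
        return []
    # Compare case-insensitively and ignore punctuation attached to a word.
    norm = [w.text.lower().strip(".,;:!?\"'()") for w in words]
    boxes = []
    for i in range(len(words) - len(terms) + 1):
        if norm[i:i + len(terms)] == terms:
            run = words[i:i + len(terms)]
            left = min(w.x for w in run)
            top = min(w.y for w in run)
            right = max(w.x + w.w for w in run)
            bottom = max(w.y + w.h for w in run)
            # Simplification: a phrase that wraps onto the next line
            # would get one large box instead of one box per line.
            boxes.append((left, top, right - left, bottom - top))
    return boxes


# Made-up sample page: two lines of text with invented coordinates.
PAGE = [
    Word("The", 10, 10, 30, 12),
    Word("open", 45, 10, 40, 12),
    Word("library,", 90, 10, 60, 12),
    Word("an", 10, 30, 20, 12),
    Word("open", 35, 30, 40, 12),
    Word("book.", 80, 30, 45, 12),
]

print(find_highlights(PAGE, "open"))          # [(45, 10, 40, 12), (35, 30, 40, 12)]
print(find_highlights(PAGE, "open library"))  # [(45, 10, 105, 12)]
```

The server would then only need to paint those rectangles over the stored page image before sending it, so the image itself never has to be "searched".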

shachi
08-06-2007, 03:29 PM
"The search doesn't work for me so I can't be entirely sure"

Oh.

"but I'd suspect the highlighted page images are generated server-side,"

True, but how, unless they've stored the text of the book as well?

I'd love to know how it's done without actually storing the text.

Thanks for your reply. :)

Twey
08-06-2007, 04:30 PM
"and the text of the book is also stored."

It is possible that it was done with a text-recognition algorithm.
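
For a feel of what text recognition means at its very simplest, here is a toy sketch in Python that matches fixed-width glyphs against stored templates and picks the closest one. Real OCR has to cope with varying fonts, skew, segmentation, and noise, so treat this as an illustration of the idea only. The tiny three-glyph font, the `recognize` function, and the `SCAN` bitmap are all made up for the example.

```python
# Hypothetical 3x5 pixel templates: '#' is ink, '.' is background.
TEMPLATES = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}
GLYPH_W = 3  # width of each glyph cell in pixels
GAP = 1      # blank columns between glyphs


def distance(cell, template):
    """Count the pixels where the scanned cell and the template differ."""
    return sum(
        a != b
        for cell_row, tmpl_row in zip(cell, template)
        for a, b in zip(cell_row, tmpl_row)
    )


def recognize(bitmap):
    """Read evenly spaced glyphs left to right, choosing the nearest template."""
    width = len(bitmap[0])
    text = []
    for left in range(0, width - GLYPH_W + 1, GLYPH_W + GAP):
        cell = [row[left:left + GLYPH_W] for row in bitmap]
        best = min(TEMPLATES, key=lambda ch: distance(cell, TEMPLATES[ch]))
        text.append(best)
    return "".join(text)


# A "scan" of the digits 1, 0, 7 with one stray ink pixel inside the 0.
SCAN = [
    ".#..###.###",
    "##..#.#...#",
    ".#..###..#.",
    ".#..#.#..#.",
    "###.###..#.",
]

print(recognize(SCAN))  # 107
```

The noisy 0 is still read correctly because it differs from the "0" template by one pixel but from the other templates by many more. Once a page has been recognized this way, the resulting text can be stored and searched like any other text.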

shachi
08-07-2007, 11:04 AM
Well, that seems pretty reasonable. I wonder how they do it. What I'd love to see is a text-recognition algorithm in PHP. That'd be quite a challenge. :p

djr33
08-07-2007, 11:08 AM
I think it's stored server-side. That image may be rendered as well. I doubt they are storing full images for every single page.
If you notice, the audio is just a recording of the book, so it's probably the same with text... just interrelated pieces.

shachi
08-08-2007, 06:57 PM
djr33: True, I completely missed that.