
Search option for extracting content in word document



hemi519
09-14-2011, 05:33 AM
Hi All,

I'd like to know whether it is possible to search the contents of .doc files for a given keyword. I have a website that hosts many .doc files (resumes). If I type JAVA into my search box, I want to get back all the documents that contain the string JAVA. If this is possible, please let me know how to do it.

JShor
09-14-2011, 04:07 PM
Yes, but if you have a lot of documents, the search may take a while to process. You would have to loop through all of your .doc files, decode each one to get its text, and run a plain string or regex search on that text. If the search finds a match, the .doc file contains the string you're looking for, and based on that you can return your data.

I know this is a pretty vague response, but it is possible.

Antiword is a free command-line program (not a library as such) that will allow you to read .doc files and get their text:
http://www.winfield.demon.nl/
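
To make the loop described above a bit less vague, here is a rough sketch in Python. It assumes the antiword program is installed and on your PATH, and that running it with a file name prints the document's text to standard output. The function names (extract_text_with_antiword, find_matching_docs) are made up for illustration, and the same idea should carry over to whatever language your site uses.

```python
import re
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def extract_text_with_antiword(path: Path) -> str:
    """Return the plain text of a .doc file.

    Assumption: the `antiword` binary is installed and on PATH.
    Returns an empty string if antiword reports an error for the file.
    """
    result = subprocess.run(
        ["antiword", str(path)],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate bytes that do not decode cleanly
        check=False,
    )
    return result.stdout if result.returncode == 0 else ""


def find_matching_docs(
    paths: Iterable[Path],
    keyword: str,
    extract: Callable[[Path], str] = extract_text_with_antiword,
) -> List[Path]:
    """Return the paths whose extracted text contains `keyword`.

    The keyword is escaped, so it is matched literally (case-insensitive)
    rather than interpreted as a regex.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [p for p in paths if pattern.search(extract(p))]
```

You would call it with something like find_matching_docs(Path("resumes").glob("*.doc"), "JAVA"), where "resumes" is a placeholder for wherever your files live. Every file is decoded on every search, which is why it gets slow with many documents.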