I tried using getElementById and innerHTML to do this, but they wouldn't work.
How would I make a script that scans the page you're on for a word, then shows a div with the number of times that word appears on the page?

Thanks for any help!
This is probably not the best way of counting words, but it is a solution.
Code:
function wordCount(word) {
  var HTML = document.getElementsByTagName("body")[0].innerHTML;
  // Drop style/script blocks first, then strip the remaining tags.
  var text = HTML.replace(/<(style|script).*?>(.|\r?\n)*?<\/\1>/gi, "").replace(/<.*?>/gi, "");
  var count = 0;
  var regex = new RegExp("(\\r?\\n| |\\.)" + word + "(\\r?\\n| |\\.)", "i");
  // The regex above needs more checking, such as for quotes, question marks, etc.
  // Remove one match per pass, keeping its delimiters, until none are left.
  while(regex.test(text)) {
    count++;
    text = text.replace(regex, "$1$2");
  }
  return count;
}
Then you could write some more code to create the div that displays the word and the returned count.
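For the display step mentioned above, here is a rough sketch of one way it might look. The names `countMessage` and `showWordCount` and the id `word-count-box` are made up for illustration and are not from anyone's post. The optional `doc` parameter is also an assumption, there only so the function can be pointed at a different document object.

```javascript
// Hypothetical helper: build the text shown to the user.
function countMessage(word, count) {
  return '"' + word + '" appears ' + count +
         (count === 1 ? " time" : " times") + " on this page.";
}

// Hypothetical helper: create the result div (or reuse it) and fill it in.
// "word-count-box" is an assumed id; use whatever suits your page.
function showWordCount(word, count, doc) {
  doc = doc || document;
  var box = doc.getElementById("word-count-box");
  if (!box) {
    box = doc.createElement("div");
    box.id = "word-count-box";
    doc.body.appendChild(box);
  }
  // textContent inserts plain text, so the word is not parsed as HTML.
  box.textContent = countMessage(word, count);
  return box;
}

// Usage in a browser, with one of the wordCount functions from this thread:
//   showWordCount("ham", wordCount("ham"));
```

Note that the div's own text contains the word, so running the count a second time while the div is on the page would count that occurrence as well.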
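With innerHTML, it's easy:
Code:
function wordCount(word) {
  // document itself has no innerHTML property, so read it from the body element.
  // match() returns null when nothing matches, so fall back to an empty array.
  return (document.body.innerHTML.match(new RegExp(word, "gi")) || []).length;
}
It might be more difficult to do it properly with DOM methods, though.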
Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!
Twey, your code would return extra word counts when a word is within another word, such as "ham" in "hammer". Also, it would look within tags that don't display text within them, such as SCRIPT or STYLE. Those are the only ones I can think of at the moment, but more could easily be added.
My idea for matching only the exact word is to require a delimiter at both ends of it. Now that I think about it, the best way of doing that is /([^a-zA-Z]|^)word([^a-zA-Z]|$)/gi
Perhaps:
Code:
function wordCount(word) {
  var HTML = document.getElementsByTagName("body")[0].innerHTML;
  // Drop style/script blocks first, then strip the remaining tags.
  var text = HTML.replace(/<(style|script).*?>(.|\r?\n)*?<\/\1>/gi, "").replace(/<.*?>/gi, "");
  var regex = new RegExp("([^a-zA-Z]|^)" + word + "([^a-zA-Z]|$)", "gi");
  // match() returns null when there are no matches.
  var matches = text.match(regex);
  return matches ? matches.length : 0;
}
Last edited by Trinithis; 06-03-2007 at 07:35 PM. Reason: To add the option that the word starts or ends the string
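A small sketch of the delimiter idea above, applied to a plain string so it can be tried without a page. Two things here are additions, not part of the posts: the helper names `escapeRegExp` and `countWholeWord`, and the lookahead for the trailing delimiter. The lookahead is there because a delimiter consumed by one match cannot start the next one, so "ham ham" would otherwise be counted once.

```javascript
// Hypothetical helper: escape characters that are special in a regex,
// so a search word like "c++" is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: count whole-word, case-insensitive occurrences.
function countWholeWord(text, word) {
  var regex = new RegExp(
    "(^|[^a-zA-Z])" + escapeRegExp(word) + "(?=[^a-zA-Z]|$)", // trailing delimiter is only peeked at
    "gi"
  );
  var matches = text.match(regex); // null when there are no matches
  return matches ? matches.length : 0;
}

// Usage:
//   countWholeWord("ham ham hammer", "ham")  // 2, "hammer" is not counted
```

Like the original pattern, this treats only ASCII letters as part of a word, so digits and accented letters count as delimiters.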
Hmm... probably better to do it with DOM methods. We're venturing into the realms of parsing HTML with regex here, which is never a good idea (e.g. there could validly be > or < characters within event handlers).
Code:
if(typeof Node === "undefined")
  var Node = { 'TEXT_NODE' : 3 };

function wordCount(word, caseSens, el) {
  el = el || document.body;
  var total = 0;
  // Build the regex once per call rather than once per text node.
  var regex = new RegExp('\\b' + word + '\\b', "g" + (caseSens ? "" : "i"));
  for(var i = 0, e = el.childNodes; i < e.length; ++i)
    if(e[i].nodeType === Node.TEXT_NODE)
      // match() returns null when the node has no matches.
      total += (e[i].nodeValue.match(regex) || []).length;
    else
      total += wordCount(word, caseSens, e[i]);
  return total;
}
Last edited by Twey; 06-03-2007 at 07:47 PM.
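A possible variation on the DOM walk above that also skips SCRIPT and STYLE elements, the concern raised earlier in the thread. This is only a sketch: the function name `countInTree` is made up, and it takes a ready-made global regex rather than a word. It reads only `nodeType`, `nodeName`, `nodeValue` and `childNodes`, so it works on real DOM nodes and can also be tried on plain objects of the same shape.

```javascript
var TEXT_NODE = 3, ELEMENT_NODE = 1; // standard DOM nodeType values

// Hypothetical helper: count regex matches in the text nodes under `node`,
// ignoring script/style contents and non-element nodes such as comments.
// `regex` must have the "g" flag so match() returns every occurrence.
function countInTree(node, regex) {
  if (node.nodeType === TEXT_NODE) {
    var m = node.nodeValue.match(regex); // null when nothing matches
    return m ? m.length : 0;
  }
  if (node.nodeType !== ELEMENT_NODE) return 0;
  var name = node.nodeName.toUpperCase();
  if (name === "SCRIPT" || name === "STYLE") return 0;
  var total = 0;
  for (var i = 0; i < node.childNodes.length; ++i)
    total += countInTree(node.childNodes[i], regex);
  return total;
}

// Usage in a browser:
//   countInTree(document.body, /\bham\b/gi);
```

One limitation shared with the code above: a word split across elements, such as h<b>am</b>, sits in two text nodes and is not counted.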