A gift from me to you!



Falkon303
01-16-2009, 12:34 PM
I might get grilled for this because it isn't utilizing the DOM, so use it at your own risk, but I am prepared to risk it for the sake of awesomeness.

*hides out of fear of admin wrath*

So what I wrote is a function that finds all values of a given kind on a page very efficiently. It is easily modified to pull data from any set of elements within the body of a web page, although you must give the <body> tag the id "body": <body id="body">. Currently it pulls all hyperlink href values.

This is the script/CSS for the page head:



<script type="text/javascript" >
function getcrawling()
{
// Begin crawling body element
bodycrawl = document.getElementById('body');
elementcrawl = bodycrawl.getElementsByTagName('a');
var scripts = new Array();
for (j=0;j<elementcrawl.length;j++)
{scripts[j] = elementcrawl[j].href + '<br><br>';}
// End crawling sub-elements
document.getElementById('acrawlingfool').innerHTML = scripts;
}
</script>
<style type="text/css">
<!--
#acrawlingfool {
position:absolute;
width:499px;
height:329px;
z-index:1;
left: 333px;
top: 106px;
background-color: #CCCCCC;
overflow:auto;
}
-->
</style>



This is the HTML for the layer with the id "acrawlingfool", which must be in the page somewhere.


<div id="acrawlingfool" name="acrawlingfool"><a onclick="getcrawling();">click to crawl</a></div>

I will try and set up a working example on my google page. :)

Falkon303
01-16-2009, 01:00 PM
http://Falkon303.googlepages.com/

On that page, the bottom entry explains how you can utilize javascript to tax servers, or perhaps just your own computer.

Nothing sinister, perhaps javascript blocks it, but what I do is dump page contents into a div layer. The page contents can include the reinitialization of the script, leading to an endless loop of resource pulling.... in theory. I am not sure of the restrictions of javascript, but if it's solid, this happens.

I just like using it to pull a hrefs and widths. Dunno if I'll ever actually use that, but I imagine it could be extended to pulling input forms dynamically, or perhaps refined to pull all words and break them into arrays... endless possibilities really (or maybe I am just super caffeinated).

Twey
01-16-2009, 02:43 PM
Why are all those variables global?
Array.map = function(f, a) {
  for (var i = a.length, r = []; --i >= 0; )
    r[i] = f(a[i], i);

  return r;
};

Array.foldl = function(f, a, t) {
  for (var i = t === undefined ? (t = a[0], 1) : 0, n = a.length; i < n; ++i)
    t = f(t, a[i]);

  // Return the accumulated value; without this the fold yields undefined.
  return t;
};

var getcrawling = (function() {
  // Append child to parent and hand the parent back, so it can be folded.
  function withChild(parent, child) {
    parent.appendChild(child);
    return parent;
  }

  // Build a <div> containing the link's href as text.
  function divWrapHref(a) {
    return document
      .createElement("div")
      .appendChild(document.createTextNode(a.href))
      .parentNode;
  }

  function getcrawling() {
    document
      .getElementById("acrawlingfool")
      .appendChild(Array.foldl(withChild,
                               Array.map(divWrapHref,
                                         document.body.getElementsByTagName('a')),
                               document.createDocumentFragment()));
  }

  return getcrawling;
})();

jscheuer1
01-16-2009, 03:32 PM
Things like this are done all the time in scripts that must examine all a tags for an attribute and then assign events to them if they have it. It is nothing new, nor particularly well done in this case.

The new model for this sort of thing is to simply assign the event to the document. Then, if one of the elements in question is in the DOM hierarchy of the event's target and it has the attribute being selected for, execute the code.
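
A rough sketch of that idea in plain JavaScript follows. The names findInAncestry and delegateClicks are made up for illustration, and the rel check is only an assumed example of an attribute to select on.

```javascript
// Walk from the event target up through its ancestors and return the
// first node that satisfies the test, or null if none does.
function findInAncestry(node, qualifies) {
  for (var n = node; n; n = n.parentNode) {
    if (qualifies(n)) return n;
  }
  return null;
}

// Attach ONE listener to the root; individual elements are never touched.
function delegateClicks(root, qualifies, handler) {
  root.addEventListener("click", function (event) {
    var match = findInAncestry(event.target, qualifies);
    if (match) handler(match, event);
  });
}

// Browser usage (assumed example: links that carry a rel attribute).
if (typeof document !== "undefined") {
  delegateClicks(document, function (n) {
    return n.nodeName.toLowerCase() === "a" && n.getAttribute("rel");
  }, function (link) {
    console.log("clicked", link.href);
  });
}
```

Because the test runs at click time, it does not matter when the link was added to the page.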

Falkon303
01-17-2009, 12:40 AM
jscheuer1, I am not a DOM expert yet. I am still learning DOM. :(

I liked it more because the code for combining a load of values into one string was so short:

for (j=0;j<elementcrawl.length;j++)
{scripts[j] = elementcrawl[j].href + '<br><br>';}

I couldn't find that on the web anywhere...

Twey
01-17-2009, 02:49 AM
They probably won't be independent scripts because they're not very useful on their own, but they'll be all over the place as parts of other scripts.

If you just want to combine the hrefs into a string, the Array.prototype.join() function is perfect:
Array.map = function(f, a) {
  for (var i = a.length, r = []; --i >= 0; )
    r[i] = f(a[i], i);

  return r;
};

var Operator = {
  lookup: function(p) {
    return function(o) {
      return o[p];
    };
  }
};

function getcrawling() {
  return Array.map(Operator.lookup("href"), document.links).join("\n\n");
}
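
For comparison, in browsers that support Array.from, much the same result can be had without the helper functions. This is only a minimal sketch: joinHrefs is a hypothetical name, not something from the posts above, and it accepts any array-like of objects with an href so that it can be tried outside a browser too.

```javascript
// Hypothetical helper: collect the href of every item in an array-like
// (for example document.links) and join them into one string.
function joinHrefs(links, separator) {
  // Array.from accepts array-likes such as an HTMLCollection and applies
  // the mapping function to each item.
  return Array.from(links, function (link) {
    return link.href;
  }).join(separator === undefined ? "\n\n" : separator);
}

// In a browser: joinHrefs(document.links)
// Plain objects work as well:
var sample = {
  length: 2,
  0: { href: "http://a.example/" },
  1: { href: "http://b.example/" }
};
console.log(joinHrefs(sample, "<br><br>"));
```

Joining with an explicit separator also avoids the commas that appear when an array is converted to a string implicitly, which is what happens when an array is assigned straight to innerHTML.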

jscheuer1
01-17-2009, 04:43 AM
Twey is absolutely correct. For some examples look at Lightbox and the many lightbox clones - just to name a few. From Lightbox 2.03a:




initialize: function() {
    if (!document.getElementsByTagName){ return; }
    var anchors = document.getElementsByTagName('a');
    var areas = document.getElementsByTagName('area');

    // loop through all anchor tags
    for (var i=0; i<anchors.length; i++){
        var anchor = anchors[i];

        var relAttribute = String(anchor.getAttribute('rel'));

        // use the string.match() method to catch 'lightbox' references in the rel attribute
        if (anchor.getAttribute('href') && (relAttribute.toLowerCase().match('lightbox'))){
            anchor.onclick = function () {myLightbox.start(this); return false;}
        }
    }

    // loop through all area tags
    // todo: combine anchor & area tag loops
    for (var i=0; i< areas.length; i++){
        var area = areas[i];

        var relAttribute = String(area.getAttribute('rel'));

        // use the string.match() method to catch 'lightbox' references in the rel attribute
        if (area.getAttribute('href') && (relAttribute.toLowerCase().match('lightbox'))){
            area.onclick = function () {myLightbox.start(this); return false;}
        }
    }

    // The rest of this code inserts html at the bot . . .

However, in Lightbox 2.04:



updateImageList: function() {
    this.updateImageList = Prototype.emptyFunction;

    document.observe('click', (function(event){
        var target = event.findElement('a[rel^=lightbox]') || event.findElement('area[rel^=lightbox]');
        if (target) {
            event.stop();
            this.start(target);
        }
    }).bind(this));
},


Here (in Lightbox 2.04) we find an example of the new paradigm. Though in my opinion the script has several weak points, the above quoted section is a quantum leap in event initialization. It listens to the document. If a click on the document is also on a tag configured for the script, it invokes the code. This differs from the previous (2.03a) example in at least two significant ways:


1. Only one event initialization is required, rather than looping through the document initializing every element that qualifies.

2. If other content with the Lightbox syntax is added to the document after the onload event, for example via AJAX, it will be immediately recognised as qualifying for execution under the Lightbox code. In previous versions it would have had to be initialized after being added, or else it wouldn't execute as desired/expected.


Note: Lightbox uses the prototype.js script library, which is why the code quoted for it appears oversimplified and of course would not work in an environment not containing the prototype.js library.
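
For readers without prototype.js, roughly the same behaviour can be sketched with standard DOM calls. This is an approximation and not Lightbox's code: makeLightboxClickHandler and the start callback are hypothetical, Element.closest needs a reasonably modern browser, and preventDefault only cancels the default action, whereas Prototype's event.stop() also stops propagation.

```javascript
// Build a click handler that reacts only when the click landed on, or
// inside, an <a> or <area> whose rel attribute starts with "lightbox".
// `start` is a hypothetical callback standing in for Lightbox's start().
function makeLightboxClickHandler(start) {
  return function (event) {
    // closest() checks the target itself, then each ancestor in turn.
    var target = event.target.closest("a[rel^=lightbox], area[rel^=lightbox]");
    if (target) {
      event.preventDefault(); // keep the browser from following the link
      start(target);
    }
  };
}

// One listener on the document covers links added later as well.
if (typeof document !== "undefined") {
  document.addEventListener("click", makeLightboxClickHandler(function (el) {
    console.log("would open", el.href);
  }));
}
```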

Falkon303
02-04-2009, 11:17 PM
Very good to know. I like the "onclick" idea very much. I'll check out the crawling method posted as well. thanx!

- Ben