penguins87
01-29-2010, 03:34 AM
I am trying to make a crawler for my website with PHP.
I got this code from a tutorial. Can you tell me how to use this function and loop to allow it to follow the links in my website?
<?php
// Fetch one page and pull out its title, meta keywords, meta description and links.
function crawl($url) {
    $html = file_get_contents($url);
    if ($html === false) {
        return false; // page could not be fetched
    }
    // Page title (empty string when the tag is missing)
    $title = preg_match("/<title>(.+)<\/title>/siU", $html, $matches) ? $matches[1] : "";
    // Meta keywords
    $k = "<meta\s+name=['\"]??keywords['\"]??\s+content=['\"]??(.+)['\"]??\s*\/?>";
    $keywords = preg_match("/$k/siU", $html, $matches) ? $matches[1] : "";
    // Meta description
    $d = "<meta\s+name=['\"]??description['\"]??\s+content=['\"]??(.+)['\"]??\s*\/?>";
    $desc = preg_match("/$d/siU", $html, $matches) ? $matches[1] : "";
    // Every href on the page: preg_match_all, because preg_match stops at the first link
    $rp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    preg_match_all("/$rp/siU", $html, $matches);
    $links = $matches[2];
    $info = array("url" => $url, "title" => $title, "keywords" => $keywords, "description" => $desc, "links" => $links);
    return $info;
}
?>
Thanks.
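For the looping part, one common approach is a queue of URLs still to visit plus a list of URLs already seen, so no page is fetched twice. Below is a rough sketch of that idea, not a finished crawler. The names extract_links(), resolve_link() and crawl_site() are hypothetical helpers made up for this example (they are not from the tutorial), and the sketch assumes you only want to follow links on the same host as the start URL. It does not normalise "../" segments or honour robots.txt.

```php
<?php
// Hypothetical helper: return every href value found in an HTML string.
function extract_links($html) {
    preg_match_all('/<a\s[^>]*href=(["\']?)([^"\' >]+)\1[^>]*>/si', $html, $m);
    return $m[2];
}

// Hypothetical helper: turn a link found on $pageUrl into an absolute URL.
// Returns null for links that should not be followed (mailto:, javascript:, bare #fragment).
// Assumption: simple cases only; "../" segments are left as they are.
function resolve_link($pageUrl, $link) {
    $link = preg_replace('/#.*$/', '', $link);          // drop any #fragment
    if ($link === '') {
        return null;
    }
    if (preg_match('#^https?://#i', $link)) {
        return $link;                                   // already absolute
    }
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $link)) {
        return null;                                    // some other scheme
    }
    $p = parse_url($pageUrl);
    if (substr($link, 0, 2) === '//') {
        return $p['scheme'] . ':' . $link;              // protocol-relative link
    }
    $root = $p['scheme'] . '://' . $p['host'] . (isset($p['port']) ? ':' . $p['port'] : '');
    if ($link[0] === '/') {
        return $root . $link;                           // root-relative link
    }
    $path = isset($p['path']) ? $p['path'] : '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);   // directory of the current page
    return $root . $dir . $link;
}

// Breadth-first crawl starting at $startUrl, staying on the same host.
// $fetch is any callable that takes a URL and returns the HTML string, or false on failure.
// Returns an array of the form: page URL => list of same-host links found on that page.
function crawl_site($startUrl, $maxPages = 50, $fetch = 'file_get_contents') {
    $host  = parse_url($startUrl, PHP_URL_HOST);
    $queue = array($startUrl);          // URLs still to visit
    $seen  = array($startUrl => true);  // URLs already queued, so nothing is fetched twice
    $pages = array();

    while ($queue && count($pages) < $maxPages) {
        $url  = array_shift($queue);
        $html = @call_user_func($fetch, $url);
        if (!is_string($html)) {
            continue;                   // fetch failed, skip this page
        }
        $pages[$url] = array();
        foreach (extract_links($html) as $link) {
            $next = resolve_link($url, $link);
            if ($next === null || parse_url($next, PHP_URL_HOST) !== $host) {
                continue;               // unusable link or a different site
            }
            $pages[$url][] = $next;
            if (!isset($seen[$next])) {
                $seen[$next] = true;
                $queue[] = $next;
            }
        }
    }
    return $pages;
}
```

You would call it as something like crawl_site("http://www.example.com/", 100), where the URL is a placeholder for your own site, and then loop over the result. Inside the while loop is also where you could call crawl($url) to collect the title, keywords and description for each page. The $maxPages limit is there so a mistake in the link handling cannot make the loop run forever.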