Thread: Extract links from an external webpage

  1. #1
    Join Date
    Jan 2007
    Posts
    25
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Extract links from an external webpage

    How do I extract links from an external webpage and display them with php?

    Example:
    I would have php extract all the links from: http://www.php-mysql-tutorial.com/
    and display the links in the php file.

  2. #2
    Join Date
    Jul 2006
    Location
    Canada
    Posts
    2,581
    Thanks
    13
    Thanked 28 Times in 28 Posts

    Sorry, but as far as I know, you can't extract or open anything from an external webpage.
    This would be the logical way to do it:
    PHP Code:
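    $page = file("http://www.php-mysql-tutorial.com/");
    foreach ($page as $num => $lines) {
        // Find the first anchor on this line; strpos() returns false when there is none.
        $links = strpos($lines, "<a");
        $linksend = ($links === false) ? false : strpos($lines, "</a>", $links);
        if ($linksend !== false) {
            // substr() takes a length, so measure from "<a" through the end of "</a>".
            echo substr($lines, $links, $linksend - $links + 4);
        }
    }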
    But you can't get contents from external pages... I'll keep looking, but I don't think it's possible.
    - Mike

  3. #3
    Join Date
    Jan 2007
    Posts
    25
    Thanks
    0
    Thanked 0 Times in 0 Posts

    That's exactly what I wanted. Thanks.
    Is it possible to make one that only extracts the links in a certain folder?
    i.e., one that would extract all links in the folder "jump"

  4. #4
    Join Date
    Jul 2006
    Location
    Canada
    Posts
    2,581
    Thanks
    13
    Thanked 28 Times in 28 Posts

    Sure. Edit the search query:
    PHP Code:
    $links = strpos($lines, "<a href=\"subfolder here/");
    - Mike
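
A minimal sketch of the folder filter discussed above, assuming the hrefs have already been collected into an array of strings. The helper name filterLinksByFolder and the sample URLs are made up for illustration, and the code assumes PHP 7 or later. It compares the URL path rather than the raw tag text, so it also matches absolute URLs and skips look-alike folders such as "jumper".

```php
<?php
// Hypothetical helper: keep only the hrefs whose path starts with /$folder/.
function filterLinksByFolder(array $hrefs, string $folder): array
{
    $prefix = '/' . trim($folder, '/') . '/';
    $kept = [];
    foreach ($hrefs as $href) {
        // parse_url() returns the path component for absolute and relative URLs.
        $path = parse_url($href, PHP_URL_PATH);
        if (!is_string($path)) {
            continue; // no usable path in this href
        }
        // Normalize relative paths such as "jump/a.html" to "/jump/a.html".
        $path = '/' . ltrim($path, '/');
        if (strpos($path, $prefix) === 0) {
            $kept[] = $href;
        }
    }
    return $kept;
}

// Sample data, invented for this example.
$hrefs = [
    'jump/one.html',
    '/jump/two.html',
    'http://www.example.com/jump/three.html',
    'http://www.example.com/other/four.html',
    'jumper/five.html',
];
print_r(filterLinksByFolder($hrefs, 'jump'));
```

This only matches a folder at the start of the path; matching "jump" deeper in the path would need a different comparison.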

  5. #5
    Join Date
    Jun 2005
    Location
    United Kingdom
    Posts
    11,878
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    For a more robust solution, the PHP5 DOM module is capable of parsing HTML. You can then use getElementsByTagName() to find all the links in a document.
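
A minimal sketch of the DOM approach, assuming PHP 7 or later with the DOM extension loaded. The helper name extractLinks and the sample markup are made up for illustration; a remote page would first have to be fetched into a string, for example with file_get_contents().

```php
<?php
// Hypothetical helper: return the href of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; buffer parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Named anchors without an href are skipped.
        if ($anchor->hasAttribute('href')) {
            $urls[] = $anchor->getAttribute('href');
        }
    }
    return $urls;
}

// Inline sample markup so the sketch runs without network access.
$html = '<html><body><p><a href="/jump/a.html">A</a> <a name="top">no href</a> '
      . '<a href="http://www.example.com/">B</a></p></body></html>';
print_r(extractLinks($html));
```

Unlike the line-by-line strpos() search, this finds several links on one line and links whose tags span more than one line.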

    If you have PHP4, you'll need to use php-html. Then,
    Code:
    <?php
      include('htmlparser.inc');
    
      $urls = array();
      // Fetch the remote page and hand its HTML text to the parser.
      $remotePage = new HtmlParser(file_get_contents('http://www.example.com/some/page.html'));
    
      while($remotePage->parse())
        if($remotePage->iNodeType == NODE_TYPE_ELEMENT && strtolower($remotePage->iNodeName) == "a" && isset($remotePage->iNodeAttributes['href']))
          array_push($urls, $remotePage->iNodeAttributes['href']);
    ?>
    You'll then have a nice list of URLs in $urls.
    Quote Originally Posted by mburt
    But you can't get contents from external pages...
    You can. You will, however, need to have allow_url_fopen enabled in your php.ini.
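
A small sketch for checking the allow_url_fopen setting at runtime before trying to read a URL. The helper name remoteFetchAllowed is made up for illustration, and the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: report whether URL-aware fopen wrappers are enabled.
function remoteFetchAllowed(): bool
{
    // ini_get() returns the setting as a string such as "1", "0", "On" or "Off".
    $value = ini_get('allow_url_fopen');
    return filter_var($value, FILTER_VALIDATE_BOOLEAN);
}

if (remoteFetchAllowed()) {
    echo "allow_url_fopen is on: file() and file_get_contents() can read URLs\n";
} else {
    echo "allow_url_fopen is off: enable it in php.ini or use another HTTP client\n";
}
```

The setting cannot be switched on from a script with ini_set(); it has to be changed in php.ini or the server configuration.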

    /EDIT: XML_HTMLSax is also an option. Probably a better one, since it's been accepted into PEAR.
    Last edited by Twey; 01-29-2007 at 10:08 PM.
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  6. #6
    Join Date
    Jul 2006
    Location
    Canada
    Posts
    2,581
    Thanks
    13
    Thanked 28 Times in 28 Posts

    Wow... there are a lot of PHP functions there that are DOM related.
    - Mike

  7. #7
    Join Date
    Jan 2007
    Posts
    25
    Thanks
    0
    Thanked 0 Times in 0 Posts

    whoops. wrong topic. sorry
    Last edited by jacksont123; 02-03-2007 at 11:20 PM.
