
Thread: text without html tags retrieved by php's fopen(http:// function.

  1. #1
    Join Date
    Dec 2008
    Location
    RAJKOT,GUJARAT,INDIA
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    I want to retrieve a page from a remote site using fopen(http://www.somesite/somepage.html); and then store its content in the $content variable with the fgets function. The entire text content is stored, but the original page has tables with 5 columns and 50 rows, and my $content does not contain any HTML tags such as <table>, <tr>, or <td>. I want the page exactly as it is. I haven't used preg_match or anything else to strip the HTML tags. Is there a solution?

  2. #2
    Join Date
    Mar 2008
    Posts
    122
    Thanks
    17
    Thanked 5 Times in 5 Posts

    Hope this helps:
    PHP Code:
    function strip_html_tags( $text )
    {
        $text = preg_replace(
            array(
              // Remove invisible content
                '@<head[^>]*?>.*?</head>@siu',
                '@<style[^>]*?>.*?</style>@siu',
                '@<script[^>]*?.*?</script>@siu',
                '@<object[^>]*?.*?</object>@siu',
                '@<embed[^>]*?.*?</embed>@siu',
                '@<applet[^>]*?.*?</applet>@siu',
                '@<noframes[^>]*?.*?</noframes>@siu',
                '@<noscript[^>]*?.*?</noscript>@siu',
                '@<noembed[^>]*?.*?</noembed>@siu',
              // Add line breaks before and after blocks
                '@</?((address)|(blockquote)|(center)|(del))@iu',
                '@</?((div)|(h[1-9])|(ins)|(isindex)|(p)|(pre))@iu',
                '@</?((dir)|(dl)|(dt)|(dd)|(li)|(menu)|(ol)|(ul))@iu',
                '@</?((table)|(th)|(td)|(caption))@iu',
                '@</?((form)|(button)|(fieldset)|(legend)|(input))@iu',
                '@</?((label)|(select)|(optgroup)|(option)|(textarea))@iu',
                '@</?((frameset)|(frame)|(iframe))@iu',
            ),
            array(
                ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
                "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0", "\n\$0",
                "\n\$0", "\n\$0",
            ),
            $text );
        // Remove whatever tags remain
        return strip_tags( $text );
    }

    (Credit: http://nadeausoftware.com/articles/2..._tags_web_page)

  3. #3
    Join Date
    Dec 2008
    Location
    RAJKOT,GUJARAT,INDIA
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    I want the page as it is, meaning I want all the tags the page originally has. I think your code removes the tags.

  4. #4
    Join Date
    Mar 2008
    Posts
    122
    Thanks
    17
    Thanked 5 Times in 5 Posts

    Oh, my apologies. Have you tried the file_get_contents function?
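
For readers following along, here is a minimal sketch of what the file_get_contents suggestion might look like in practice. The helper name fetch_raw is hypothetical and not from the thread, and RuntimeException assumes PHP 5.1 or later. Reading an http:// URL this way also requires allow_url_fopen to be enabled in php.ini.

```php
<?php
// Hypothetical helper (not from the thread): read a URL or local path
// and return the raw bytes, HTML markup included.
function fetch_raw($source)
{
    // file_get_contents() returns false on failure; @ hides the warning
    // so the failure is reported through the exception instead.
    $content = @file_get_contents($source);
    if ($content === false) {
        throw new RuntimeException("Unable to read $source");
    }
    return $content;
}
```

With the placeholder URL from the question, this would be called as fetch_raw('http://www.somesite/somepage.html'). Printing the result through htmlspecialchars() shows the tags as text in a browser, which makes it easy to check that they survived the fetch.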

  5. #5
    Join Date
    Dec 2008
    Location
    RAJKOT,GUJARAT,INDIA
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Let me try; file_get_contents may work.

  6. #6
    Join Date
    Dec 2008
    Location
    RAJKOT,GUJARAT,INDIA
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    I don't know why, but the fopen(http:// function returns the entire remote web page without any HTML tags; it replaces every tag with a ^M (carriage return) character. For example, where the page has <td>subject</td>, it retrieves ^Msubject^M. By the way, file_get_contents works fine. I'm using PHP 4.2 on Apache on a Linux server. I tried a page on a remote server as well as a local page on the same server. My problem is now solved with file_get_contents. Thanks.

  7. #7
    Join Date
    Dec 2008
    Location
    Nigeria
    Posts
    95
    Thanks
    3
    Thanked 8 Times in 8 Posts

    <?php
    // Open the remote page in binary mode and read it in 8 KB chunks
    $handle = fopen("http://www.dynamicdrive.com/", "rb");
    $contents = '';
    while (!feof($handle)) {
        $contents .= fread($handle, 8192);
    }
    fclose($handle);
    // <xmp> makes the browser show the markup as text instead of rendering it
    echo "<xmp>$contents</xmp>";
    ?>

    To actually store the data, replace the last line with
    $contents = urlencode($contents);
    then save it to your database like that.
    To retrieve it, just use urldecode to reverse the encoding.
    I am using PHP 5 now, but I used this same code when I was on PHP 4.2, so it should work!
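
The store-and-retrieve step described above can be sketched as a simple round trip. The two helper names and the sample string are hypothetical, chosen only for illustration. The database write itself is left out because the thread does not say which database is used.

```php
<?php
// Hypothetical helpers illustrating the urlencode/urldecode round trip.
function encode_for_storage($html)
{
    // Every byte outside letters, digits, and -_. becomes %XX (a space becomes +)
    return urlencode($html);
}

function decode_from_storage($stored)
{
    // Reverses encode_for_storage()
    return urldecode($stored);
}

// Sample markup with the carriage return and line feed preserved
$html = "<td>subject</td>\r\n";
$stored = encode_for_storage($html);      // this string is what would be saved
$restored = decode_from_storage($stored); // identical to $html again
```

The encoded form is longer than the original, since each encoded byte takes three characters, so the storage column needs to be sized for that.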

  8. #8
    Join Date
    Mar 2008
    Posts
    122
    Thanks
    17
    Thanked 5 Times in 5 Posts

    OK, try this:
    PHP Code:
    <?php
    $page = '';
    $fh = fopen('http://www.yourwebsitehere.com/', 'r') or die('Unable to open file.');
    // Read the response in chunks of up to 1 MB until end of stream
    while (!feof($fh)) {
        $page .= fread($fh, 1048576);
    }
    echo $page;
    fclose($fh);
    ?>
    Alternatively, you can use cURL:

    PHP Code:
    <?php
    $c = curl_init('http://www.yourwebsitehere.com/');
    // Return the response body as a string instead of printing it
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
    $page = curl_exec($c);
    curl_close($c);
    ?>
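
The cURL snippet above does not check whether the request succeeded. One possible way to add that check is sketched below. The wrapper name curl_fetch is hypothetical, and the sketch assumes the cURL extension is loaded and PHP 5.1 or later for RuntimeException.

```php
<?php
// Hypothetical wrapper (not from the thread) around the cURL calls above,
// with a failure check added.
function curl_fetch($url)
{
    $c = curl_init($url);
    // Return the response body as a string instead of printing it
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    $page = curl_exec($c);
    if ($page === false) {
        // curl_exec() returns false on failure; curl_error() says why
        throw new RuntimeException('cURL request failed: ' . curl_error($c));
    }
    // The handle is released when $c goes out of scope
    return $page;
}
```

As with the other approaches in this thread, the returned string is the raw response body, so any HTML tags in the page are kept.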

  9. #9
    Join Date
    Dec 2008
    Location
    RAJKOT,GUJARAT,INDIA
    Posts
    5
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Yes, that works too.
