
Thread: Regex - Stripping content from comments in html

  1. #1

    OK, I'm totally confused by this regex stuff. I'm not really finding many places that teach regex.

    Anyway, here is what I am trying to accomplish.
    I have over 100 .html pages that need content stripped out of them and stored in variables so I can manipulate it with PHP.

    There are comments in these .html files that look like this:
    Code:
    <!-- publish type="textbox" name="About Us Content" -->Stuff I need Stripped<!-- /publish -->
    I need the stuff in between those two comments saved as a variable.

    All the pages have the same number of comment tags, and these are the exact ones, in order:
    Code:
    <!-- publish type="textbox" name="About Us Content" -->
    About Us Stuff That I Need
    <!-- /publish -->
    
    <!-- publish type="textbox" name="Calendar Content" -->
    Calendar Stuff That I Need
    <!-- /publish -->
    
    <!-- publish type="textbox" name="News Content" -->
    News Stuff That I Need
    <!-- /publish -->
    Here is the PHP code that I have right now. I don't know what to put in the $regex variable.
    PHP Code:
    <?php
    $url = "http://localhost/pathtofile/etc.htm";
    $html = file_get_contents($url);

    // The pattern is the part I don't know how to write yet.
    $regex = "";
    preg_match_all($regex, $html, $matches);

    foreach ($matches[0] as $div) {
        echo $div;
    }
    ?>
    It would be really cool if someone could help out.
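One possible value for $regex, as a rough sketch rather than a definitive answer: match the opening publish comment, lazily capture everything up to the closing one, and read the captured text from $matches[1] instead of $matches[0]. The sample $html string below is an assumption standing in for one of the real pages, and the pattern assumes the attribute values never contain a ">" character.

```php
<?php
// Assumed sample markup, standing in for the result of file_get_contents($url).
$html = <<<HTML
<!-- publish type="textbox" name="About Us Content" -->
About Us Stuff That I Need
<!-- /publish -->

<!-- publish type="textbox" name="Calendar Content" -->
Calendar Stuff That I Need
<!-- /publish -->
HTML;

// Opening comment, lazy capture, closing comment.
// "s" lets "." match newlines; "i" makes the match case-insensitive.
$regex = '~<!--\s*publish\b[^>]*-->(.*?)<!--\s*/publish\s*-->~si';
preg_match_all($regex, $html, $matches);

// $matches[0] holds the full matches (comments included);
// $matches[1] holds only the text between each pair of comments.
$contents = array_map('trim', $matches[1]);
print_r($contents);
```

The lazy `(.*?)` matters here: a greedy `(.*)` would run from the first opening comment to the last closing comment and return one large match instead of one per block.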

  2. #2

    OK, so I have solved it. Here is what I came up with.
    PHP Code:
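    <?php
    $url = "http://www.localhost/html.html";
    $html = file_get_contents($url);

    // Turn the publish comments into pseudo-tags so strip_tags() can keep them
    // while removing every other tag.
    $change1 = str_replace("<!-- publish", "<publish", $html);
    $change2 = str_replace("<!-- /publish -->", "</publish>", $change1);
    $change3 = strip_tags($change2, '<publish>');
    $change4 = str_replace("-->", ">", $change3);

    preg_match_all('|<publish[^>]*?>(.*?)</publish>|si', $change4, $matches);

    // $matches is an array, so join the captured contents before echoing them.
    echo "<textarea name=\"\" cols=\"115\" rows=\"30\">" . htmlspecialchars(implode("\n", $matches[1])) . "</textarea>";

    print_r($matches);
    ?>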
    Last edited by benslayton; 07-19-2008 at 10:45 AM.
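For comparison, the same extraction can probably be done without the str_replace() and strip_tags() steps by matching the comments directly and keying each block by its name attribute. This is only a sketch: extract_publish_blocks() is a hypothetical helper name, the sample string is made up, the type hints assume PHP 7 or later, and the pattern assumes every opening comment carries a double-quoted name attribute with no ">" inside it. Unlike the strip_tags() version, it leaves any HTML inside the blocks untouched.

```php
<?php
/**
 * Hypothetical helper: returns an array of [name => content]
 * for every publish comment pair found in $html.
 */
function extract_publish_blocks(string $html): array
{
    // Group 1: value of the name attribute. Group 2: text between the comments.
    $regex = '~<!--\s*publish\b[^>]*?\bname="([^"]*)"[^>]*-->(.*?)<!--\s*/publish\s*-->~si';
    preg_match_all($regex, $html, $matches, PREG_SET_ORDER);

    $blocks = [];
    foreach ($matches as $m) {
        $blocks[$m[1]] = trim($m[2]);
    }
    return $blocks;
}

// Made-up sample input; in the thread this would come from file_get_contents($url).
$html = '<!-- publish type="textbox" name="About Us Content" -->About<!-- /publish -->'
      . "\n"
      . '<!-- publish type="textbox" name="News Content" -->' . "\nNews\n" . '<!-- /publish -->';

$blocks = extract_publish_blocks($html);
echo $blocks['News Content'];
```

With the blocks keyed by name, each one can be assigned to its own variable, for example `$about = $blocks['About Us Content'];`.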
