
Thread: Help with a preg_match

  1. #1
    Join Date
    Dec 2008
    Posts
    48
    Thanks
    11
    Thanked 0 Times in 0 Posts

    Default Help with a preg_match

    Hello,

    I am trying to find links in members' pages, to make sure a pre-determined link is present in the HTML. I am having some problems though.

    Some members use a full URL for the link, some only the base dir/file.

    So I use the function basename() on the pre-determined full URL I want to check, to make sure I only check the base dir.

    I am using preg_match like this:
    PHP Code:
    preg_match ("|<[aA] (.+?)".basename($links_full_url)."(.+?)>(.+?)<\/[aA]>|i", $result, $matches);
    This works for links like:
    http://www.dynamicdrive.com/forums/ where I end up searching for "forums". If the page I look into has:

    HTML Code:
    <a href="/forums/">whatever</a>
    or
    HTML Code:
    <a href="http://www.dynamicdrive.com/forums/">whatever</a>
    or
    HTML Code:
    <a href="forums">whatever</a>
    (all valid links), it will find them and it's fine.

    However!

    If the full URL is something like http://www.dynamicdrive.com/forums.php, I end up searching for "forums.php" and my preg_match can't find it. It can find "forums.p" though, which is strange.

    Any help please?
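
    For reference, here is a small sketch of what basename() returns for the URL shapes that come up in this thread. The `$urls` and `$needles` variable names are made up for illustration. The point is only that basename() treats the URL as a plain path string, so a trailing slash is dropped but a query string is kept as part of the result.

```php
<?php
// URL shapes discussed in this thread.
$urls = [
    'http://www.dynamicdrive.com/forums/',
    'http://www.dynamicdrive.com/forums.php',
    'http://www.dynamicdrive.com/forums.php?asd=1',
];

$needles = [];
foreach ($urls as $url) {
    // basename() knows nothing about URLs: it returns whatever follows
    // the last slash, ignoring a trailing slash.
    $needles[$url] = basename($url);
    echo $url, ' => ', $needles[$url], "\n";
}
```

    Whatever comes back is later pasted into a regex, so any character in it that has a special meaning in PCRE (such as `.` or `?`) will be interpreted as regex syntax unless it is escaped.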

  2. #2
    Join Date
    May 2007
    Location
    Boston,ma
    Posts
    2,127
    Thanks
    173
    Thanked 207 Times in 205 Posts

    Default

    Code:
    "/<a href=\"(.*?)"/i"
    I don't see how this relates to finding the full link. The full link should be contained in the href of the a tag (assuming you're not using JS).

    So i use the function basename() on the pre-determined full url i want to check to make sure i only check the base dir.
    Corrections to my coding/thoughts welcome.

  3. #3
    Join Date
    Dec 2008
    Posts
    48
    Thanks
    11
    Thanked 0 Times in 0 Posts

    Default

    OK, I should have just asked why, with:

    Code:
    $result = '<a href="mypage.php">whatever</a>';
    the following preg_match

    PHP Code:
    preg_match ("|<[aA] (.+?)mypage.php(.+?)>(.+?)<\/[aA]>|i", $result, $matches);
    finds no matches.

    Looking only in the href would not work for me.

    imagine:

    Code:
    <link rel='prev' title='whatever' href='whatever' />
    The link would be found, but not in an A tag, cheating my lookup.

  4. #4
    Join Date
    May 2007
    Location
    Boston,ma
    Posts
    2,127
    Thanks
    173
    Thanked 207 Times in 205 Posts

    Default

    What are you looking for in that example? Your first bracket finds the href=", then your second gets the closing ", and the third grabs the contents of the "a".

    The preg_match as I know it doesn't use the pipes (|) but uses the forward slash (/) around the expression.

    My example also requires the element be an "a" which wouldn't find your "link" example.

  5. #5
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Default

    Hypothetically, what sort of results do you want to get with

    $result = '<a href="mypage.php">whatever</a>';

    When I use

    Code:
    <pre><?php
    $result = '<a href="mypage.php">whatever</a>';
    preg_match ("|<[aA] (.+?)mypage.php(.+?)>(.+?)<\/[aA]>|i", $result, $matches);
    print_r($matches);
    ?></pre>
    I get the following:
    Code:
    Array
    (
        [0] => whatever
        [1] => href="
        [2] => "
        [3] => whatever
    )
    The "whatever" listed above is hyperlinked.

    As a side note, you can use the pipe for a delimiter, but I would highly recommend you do not and stick with the old standby of the forward slash / like bluewalrus suggested.

    Code:
    "/<a href=\"(.*?)"/i"
    needs to have the third quote escaped. The href in this example is not optional though.
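
    As a sketch of that correction: one way to sidestep the escaping problem is to put the pattern in a single-quoted PHP string, where double quotes need no backslash. The `$pattern` and `$html` names below are only illustrative, and the sample HTML is borrowed from the first post.

```php
<?php
// Single-quoted PHP string: the double quotes inside need no escaping.
$pattern = '/<a href="(.*?)"/i';

$html = '<a href="http://www.dynamicdrive.com/forums/">whatever</a>';

if (preg_match($pattern, $html, $matches)) {
    // $matches[1] holds the raw href value.
    echo $matches[1], "\n";
}
```

    Note that this pattern only matches when href is the first attribute right after `<a `, which is the "not optional" limitation mentioned above.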

    Just so I understand what you are looking for: you are looking for web addresses, correct? Addresses that could be in the form of:

    1 http://www.this.com
    2 <a href="this.com">yo</a>
    3 <a href="www.this.com">yo</a>
    4 <a href="https://www.this.com">yo</a>

    the corresponding matches you want will be the following:

    1 http://www.this.com
    2 this.com
    3 www.this.com
    4 https://www.this.com

    Is that correct?
    To choose the lesser of two evils is still to choose evil. My personal site

  6. The Following User Says Thank You to james438 For This Useful Post:

    nicmo (11-22-2010)

  7. #6
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Default

    Actually, why not just use str_replace() to format the data, or substr() to detect whether an anchor is being used? The users will know that the field is for formatting links, so the range of things people will attempt to enter into the field will be limited. If this were a 10-page document then we might want to use a complicated PCRE to locate and/or format the web addresses, but that does not appear to be the case here.

  8. #7
    Join Date
    Dec 2008
    Posts
    48
    Thanks
    11
    Thanked 0 Times in 0 Posts

    Default

    OK, I kind of confused the whole thing, sorry about that. By mistake I always put HTML in $result; that's wrong.

    $result only contains a URL like "http://www.dynamicdrive.com/forums.php", on which I run basename() and end up searching only for "forums.php".

    And the problem is, my preg_match can't find "forums.php"

    on

    HTML Code:
    <a href="forums.php?whatever=1">whatever</a>
    or

    HTML Code:
    <a href="forums.php">whatever</a>
    BUT

    for links like "http://www.dynamicdrive.com/forums/"

    preg_match can find "forums"

    on

    HTML Code:
    <a href="/forums/">whatever</a>
    or

    HTML Code:
    <a href="http://www.dynamicdrive.com/forums/">whatever</a>
    So sorry about my previous mistake, I bet I made everything look very confusing, hehe.

  9. #8
    Join Date
    Jan 2007
    Location
    Davenport, Iowa
    Posts
    2,385
    Thanks
    100
    Thanked 113 Times in 111 Posts

    Default

    Give this a whirl:

    Code:
    <?php
    $result = '<a href="http://www.dynamicdrive.com/forums/?this=3">whatever</a>';
    preg_match ('/(<a href=\")(.*?)(\?|\/\"|\")/i', $result, $matches);
    $match=$matches[2];
    if (substr($match,-1,1)=='/') $match=substr_replace($match,'',-1,1);
    $match=explode("/","$match");
    $found=end($match);
    print $found;
    ?>
    I can't help but think that there must be an easier way to do this without regex, but I am not understanding exactly what your script is supposed to do. I am sure that the pcre could be tightened up a bit, but it works.

    Try a few different things and let us know what fails.
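
    On the "easier way without regex" thought: once an href value has been extracted, PHP's built-in parse_url() and basename() can probably do the trimming that the substr()/explode() steps above do by hand. This is only a sketch under that assumption. The helper name `last_path_segment` is hypothetical and not part of anyone's script in this thread.

```php
<?php
// Hypothetical helper (not from the thread): returns the last path segment
// of an href, ignoring any query string or fragment.
function last_path_segment(string $href): string
{
    $path = parse_url($href, PHP_URL_PATH);
    if (!is_string($path)) {
        // No path component, or a URL that parse_url() could not handle.
        return '';
    }
    // basename() also ignores a trailing slash, so "/forums/" gives "forums".
    return basename($path);
}

echo last_path_segment('http://www.dynamicdrive.com/forums/?this=3'), "\n";
echo last_path_segment('forums.php?whatever=1'), "\n";
echo last_path_segment('/forums/'), "\n";
```

    The result can then be compared with a plain `===` against the expected name, which avoids building a regex out of user-supplied text at all.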

  10. The Following User Says Thank You to james438 For This Useful Post:

    nicmo (11-23-2010)

  11. #9
    Join Date
    Dec 2008
    Posts
    48
    Thanks
    11
    Thanked 0 Times in 0 Posts

    Default

    I have a reciprocal link system, and I need this to check if a user is linking back to another member. I use cURL to get his page and then look in the HTML for the link; if I find it, all is OK.

    This is what I have right now:

    PHP Code:
    <?php
    $result = 'more html <a href="forums.php?asd=1">whatever</a> more html';

    $full_url = 'http://www.dynamicdrive.com/forums.php?asd=1';

    preg_match ("|<[aA] (.+?)".basename($full_url)."(.+?)>(.+?)<\/[aA]>|i", $result, $matches);

    echo basename($full_url)."<br><br>";
    if (count($matches) > 0)
    {
        echo "found";
    }
    else
    {
        echo "not found";
    }
    ?>
    Copy and paste that into a PHP file and you will see: not found.
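
    A likely explanation for this particular case: basename() returns `forums.php?asd=1`, and once that is concatenated into the pattern, the `?` is read as a quantifier (it makes the preceding `p` optional) and the `.` matches any character. The pattern therefore looks for `forums.ph`, an optional `p`, then `asd=1`, and never matches the literal `?` in the HTML. The sketch below compares the raw needle with one passed through preg_quote(). The `$needle`, `$unquoted` and `$quoted` names are illustrative, and `[aA]` is shortened to `a` because the `i` modifier already ignores case.

```php
<?php
$result   = 'more html <a href="forums.php?asd=1">whatever</a> more html';
$full_url = 'http://www.dynamicdrive.com/forums.php?asd=1';

// Escape regex metacharacters in the needle. The second argument is the
// pattern delimiter ('|' here), so that it gets escaped as well.
$needle = preg_quote(basename($full_url), '|');

// Raw needle, as in the post above.
$unquoted = preg_match('|<a (.+?)' . basename($full_url) . '(.+?)>(.+?)</a>|i', $result);

// Escaped needle.
$quoted = preg_match('|<a (.+?)' . $needle . '(.+?)>(.+?)</a>|i', $result, $matches);

echo 'raw needle: ', $unquoted ? 'found' : 'not found', "\n";
echo 'escaped needle: ', $quoted ? 'found' : 'not found', "\n";
```

    With the escaped needle, `$matches[3]` holds the link text. The same escaping also stops `.` in names like `forums.php` from matching unrelated characters.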

  12. #10
    Join Date
    Dec 2008
    Posts
    48
    Thanks
    11
    Thanked 0 Times in 0 Posts

    Default

    Another example where preg_match fails. strpos, however, can find it. My problem is that I need to make sure it's inside an A tag, to make sure it's a proper link.

    PHP Code:
    <?php
    $result = '<strong></strong><font face="Verdana" size="2"><br>
    » </font><strong><a href="OtherMethods.php"><font face="Verdana" size="2">Other
    Methods</font></a></strong>';

    $full_url = 'http://www.site.com/OtherMethods.php';

    preg_match ("|<[aA] (.+?)".basename($full_url)."(.+?)>(.+?)<\/[aA]>|i", $result, $matches);

    echo basename($full_url)."<br><br>";

    $mystring = $result;
    $findme   = basename($full_url);
    $pos = strpos($mystring, $findme);

    if ($pos === false) {
        echo "The string was not found in the string <br>";
    } else {
        echo "The string  was found in the string";
        echo " and exists at position $pos <bR>";
    }

    if (count($matches) > 0)
    {
        echo "preg_match found";
    }
    else
    {
        echo "preg_match  not found";
    }
    ?>
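
    In this second example the needle contains no `?`, so a different detail seems to be at work: the link text wraps over two lines ("Other" / "Methods"), and by default `.` does not match a newline, so the last `(.+?)` cannot reach `</a>`. The sketch below runs the same pattern with and without the `s` (dot-all) modifier on a shortened copy of the HTML above. The `$without_s` and `$with_s` names are illustrative only.

```php
<?php
// Shortened copy of the HTML above; the line break inside the link text is kept.
$result = '<strong><a href="OtherMethods.php"><font face="Verdana" size="2">Other
Methods</font></a></strong>';
$full_url = 'http://www.site.com/OtherMethods.php';

$needle = preg_quote(basename($full_url), '|');

// Same pattern twice; only the trailing modifiers differ.
$without_s = preg_match('|<a (.+?)' . $needle . '(.+?)>(.+?)</a>|i', $result);
$with_s    = preg_match('|<a (.+?)' . $needle . '(.+?)>(.+?)</a>|is', $result, $matches);

echo 'without s: ', $without_s ? 'found' : 'not found', "\n";
echo 'with s: ', $with_s ? 'found' : 'not found', "\n";
```

    One caveat: with `s` enabled, `.+?` can stretch across tags and lines when the page contains several links, so a tighter class such as `[^>]*` for the attribute parts may be worth trying on real pages.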
