
Thread: RegExp weirdness

  1. #1
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    RegExp weirdness

    Consider this:

    PHP Code:
    <?php
    $theBack = $_SERVER["HTTP_REFERER"];
    $thePat1 = "/^http:\/(\/www\.)|(\/)some\.com/";
    $thePat2 = "/^(http:\/\/some\.com)|(http:\/\/www\.some\.com)/";
    if (eregi($thePat1, $theBack))
        echo $theBack . ' pat one';
    if (eregi($thePat2, $theBack))
        echo $theBack . ' pat two';
    ?>
    Shouldn't these two patterns match the same strings:

    http://www.some.com/whatever

    and:

    http://some.com/whatever

    If not, why not?

    In my test environment, thePat1 matches http://some.com and thePat2 matches http://www.some.com

    The result is that I'm not 100% sure that either can be relied upon to exclude other domains. Is there a better approach to this? I just want to make sure that the referring document is from the same domain before writing out a back button to that page.
    - John
    ________________________

    Show Additional Thanks: International Rescue Committee - Donate or: The Ocean Conservancy - Donate or: PayPal - Donate

  2. #2
    Join Date
    Apr 2008
    Location
    Limoges, France
    Posts
    395
    Thanks
    13
    Thanked 61 Times in 61 Posts


    Code:
    $theBack = 'http://www.|/some.com/'; // Matches pat1 also
    I'll try to work out one pattern that matches both later. I think that when using the | pipe, the alternatives need to be inside one set of parentheses.

    For example:

    Code:
    (this | that) // not (this) | (that)
    I'm no regex expert, but I love trying to figure them out.

    Just to be clear, you would like one regex that matches both http://some.com and http://www.some.com ?
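
    To make the grouping point concrete, here is a minimal sketch using placeholder words rather than the real URLs. The test strings are made up for illustration, and it uses preg_match() rather than the ereg functions. Note that spaces inside a pattern are literal unless the x modifier is used, so the sketch writes the alternation without them.

    ```php
    <?php
    // Alternation confined to one group: the anchors apply to both words.
    $grouped   = '/^(this|that)$/';
    // No group: this reads as "^this" OR "that$".
    $ungrouped = '/^this|that$/';

    // Hypothetical test strings.
    $samples = array('that', 'this one', 'not that');

    foreach ($samples as $s) {
        echo $s, ': grouped=', preg_match($grouped, $s),
             ' ungrouped=', preg_match($ungrouped, $s), "\n";
    }
    ```

    Only 'that' matches the grouped pattern, while all three strings match the ungrouped one.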
    Last edited by JasonDFR; 03-30-2009 at 05:07 PM.

  3. #3
    Join Date
    Apr 2008
    Location
    Limoges, France
    Posts
    395
    Thanks
    13
    Thanked 61 Times in 61 Posts


    I came up with this:

    PHP Code:
    <?php

    $match = 'http://some.com/';

    $pat = '/^http:\/\/(www.|)some.com\//';

    if ( preg_match($pat, $match) ) {
        echo 'Match!<br>';
        echo 'Pat: ' . $pat . '<br>';
        echo 'Match: ' . $match;
    } else {
        echo 'no match';
    }

    exit;
    As always I would love to see other regexs. Especially if they are more elegant. hehe...

    jscheuer, let me know if this works for you. And BTW, use preg, not ereg.
    Last edited by JasonDFR; 03-30-2009 at 05:52 PM. Reason: Deleted $ from the end of my regex. Too used to putting it there.

  4. #4
    Join Date
    Sep 2008
    Location
    Bristol - UK
    Posts
    842
    Thanks
    32
    Thanked 132 Times in 131 Posts


    So the "(www.|)" is like saying "www." or "" right? Either this or nothing?

  5. #5
    Join Date
    Jun 2007
    Posts
    543
    Thanks
    3
    Thanked 78 Times in 78 Posts
    Blog Entries
    1


    yes (www\.|) is like (www\.)?
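
    A quick way to check that equivalence is to run both forms over the same inputs. This is only a sketch: the sample URLs are invented, and the dots are escaped here so that "." matches a literal dot rather than any character.

    ```php
    <?php
    $withEmptyAlt = '/^http:\/\/(www\.|)some\.com\//'; // "www." or nothing
    $withOptional = '/^http:\/\/(www\.)?some\.com\//'; // optional "www."

    // Hypothetical sample inputs.
    $samples = array(
        'http://some.com/',
        'http://www.some.com/',
        'http://wwwXsome.com/', // rejected by both because the dot is escaped
    );

    foreach ($samples as $s) {
        $a = preg_match($withEmptyAlt, $s);
        $b = preg_match($withOptional, $s);
        echo $s, ' => ', $a, ' / ', $b, "\n";
    }
    ```

    Both patterns agree on every sample: the first two match and the third does not.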
    Edit: Added below:

    Here is a useful little code snippet to test regexs. You can add as many regexs as you want:
    Code:
    $pat=Array(
    "com"=> "/^(http:\/\/)?(www\.)?([^.]*?)\.com/",
    "s com"=> "/^(http(s)?:\/\/)?(www\.)?([^.]*?)\.com/",
    "sub"=> "/^(http(s)?:\/\/)?(www\.)?([^.]*?\.)?([^.]*?)\.com/",
    "multiple domain"=> "/^(http(s)?:\/\/)?(www\.)?([^.]*\.)?([^.]*?)(\..{2}\..{2}|\..{3})/"
    );
    $padding="&nbsp;&nbsp;";
    $string="https://bob.site.co.uk";
    
    echo "String: <strong>".$string."</strong><br><br>";
    
    foreach($pat as $k=>$v) {
      $pad=$padding;
      echo 'Pattern <strong>'.$k.'</strong> <em>' . $v . '</em><br>'.$pad;
      if ( preg_match($v, $string, $matches) ) {
        echo 'Match: ';
        echo cust_print($matches, $pad, 2);
      } else {
        echo 'No Match';
      }
      echo "<br><br>";
    }
    
    function cust_print($a, $p="", $n=1) {
      if(!is_array($a)) {
        return $a;
      }
      $s="Array(";
      foreach($a as $k=>$v) {
        $s.="<br>".str_repeat($p, $n).'['.$k.'] => '.cust_print($v, $p, $n+1);
      }
      $s.="<br>".str_repeat($p, $n-1).")";
      return $s;
    }
    Last edited by Master_script_maker; 03-31-2009 at 12:46 AM.
    [Jasme Library (Javascript Motion Effects)] My Site
    /\/\@§†ê® §©®¡þ† /\/\@|{ê®
    There are 10 kinds of people in the world, those that understand binary and those that don't.

  6. #6
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12


    This is all pretty much nonsense or at least inefficient to me. In javascript we can just do:

    Code:
    <script type="text/javascript">
    var dr = 'http://www.some.com/whatever.htm',
    r = new RegExp('^http:\\/(\\/www\\.|\\/)some\\.com');
    alert(r.test(dr)); //alerts true
    dr = 'http://some.com/whatever.htm'
    alert(r.test(dr)); //alerts true
    dr = 'http://www.someother.com/whatever.htm'
    alert(r.test(dr)); //alerts false
    dr = 'http://someother.com/whatever.htm'
    alert(r.test(dr)); //alerts false
    </script>
    Is there no equivalent in PHP?

    If I have the document.referrer string as the var dr, a simple test of either one of these (not both - only one of them is required - it's just that either will work):

    Code:
    /^http:\/(\/www\.|\/)some\.com/.test(dr)
    or:

    Code:
    /^http:\/(\/www\.)|(\/)some\.com/.test(dr)
    will tell me if it comes from the some.com domain or not. There must be something equally simple in PHP, perhaps even simpler, as in many cases PHP is simpler than javascript. Perhaps something as simple as:

    HTML Code:
    isSameDomain($_SERVER["HTTP_REFERER"])
    Is there no equivalent of the javascript test() method in PHP - or better yet - a more efficient way of telling if a given URL comes from the same domain as the present page?
    - John

  7. #7
    Join Date
    Jan 2008
    Posts
    4,168
    Thanks
    28
    Thanked 628 Times in 624 Posts
    Blog Entries
    1


    I guess there is, although I don't know much about regular expressions. You could use preg_match: it returns 1 when the pattern matches and 0 when it doesn't, so the result can go straight into an if.
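
    As a rough sketch of what that looks like next to the javascript test() calls above: regex_test() is a made-up helper name, not a PHP built-in, and the pattern is the grouped one from the javascript post.

    ```php
    <?php
    // Hypothetical helper: wraps preg_match() so it returns a plain boolean,
    // similar in spirit to JavaScript's RegExp test() method.
    function regex_test($pattern, $subject)
    {
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        return preg_match($pattern, $subject) === 1;
    }

    $pattern = '/^http:\/(\/www\.|\/)some\.com/';

    var_dump(regex_test($pattern, 'http://www.some.com/whatever.htm'));      // bool(true)
    var_dump(regex_test($pattern, 'http://some.com/whatever.htm'));          // bool(true)
    var_dump(regex_test($pattern, 'http://www.someother.com/whatever.htm')); // bool(false)
    var_dump(regex_test($pattern, 'http://someother.com/whatever.htm'));     // bool(false)
    ```

    Comparing against 1 with === keeps the error case (false) from being confused with "no match" (0).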
    Jeremy | jfein.net

  8. #8
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12


    You guess? What a wimpy answer! Anyways, I found this rather simple approach after much trial and error:

    PHP Code:
    <?php
    $thePat = '^http://' . $_SERVER['HTTP_HOST'];
    $theBack = $_SERVER['HTTP_REFERER'];
    if (ereg($thePat, $theBack))
        echo $theBack;
    else
        echo 'other domain';
    ?>
    I'm thinking I will either echo a link back to the previous page ($_SERVER["HTTP_REFERER"]) or (if the test fails) include a file that has a menu of various on site pages that might be appropriate choices. I'm just wondering if there is anything that looks unworkable/dangerous/stupid/etc. here or not.
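
    Two things stand out in that approach: ereg() was deprecated in PHP 5.3 and removed in PHP 7, and the dots in the host name are used as regex wildcards. A possible preg-based variant is sketched below. referer_starts_with_host() is a hypothetical helper name, and the rule that the host must be followed by "/", "?", "#" or the end of the string is an added assumption so that a longer host name starting with the same text is not accepted.

    ```php
    <?php
    // Hypothetical helper; in a page, $host and $referer would come from
    // $_SERVER['HTTP_HOST'] and $_SERVER['HTTP_REFERER'].
    function referer_starts_with_host($host, $referer)
    {
        // preg_quote() escapes regex metacharacters in the host (the dots)
        // as well as the "~" delimiter. Assumption: http only, as in the post.
        $pattern = '~^http://' . preg_quote($host, '~') . '(?:[/?#]|$)~i';
        return preg_match($pattern, $referer) === 1;
    }

    var_dump(referer_starts_with_host('some.com', 'http://some.com/page'));          // bool(true)
    var_dump(referer_starts_with_host('some.com', 'http://some.com'));               // bool(true)
    var_dump(referer_starts_with_host('some.com', 'http://someXcom/page'));          // bool(false)
    var_dump(referer_starts_with_host('some.com', 'http://some.com.evil.example/')); // bool(false)
    ```

    The Referer header is supplied by the browser, so it can be missing or forged. That is fine for deciding whether to show a back link, but it should not be treated as a security check.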
    - John

  9. #9
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12


    Perhaps even more to the point:

    PHP Code:
    <?php 
    $theHost = $_SERVER['HTTP_HOST'];
    $theBack = $_SERVER['HTTP_REFERER'];
    $theBParse = parse_url($theBack);

    if ($theBParse['host'] == $theHost)
        echo $theBack; // or do whatever if referrer is from the same domain
    else
        echo 'other domain'; // or do whatever if the referrer is from another domain
    ?>
    As I said, this sort of thing is usually simpler in PHP than in javascript, which is why I couldn't understand how complicated the answers were getting.
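
    A slightly more defensive sketch of the same parse_url() idea follows. is_same_host() is a hypothetical helper name. Treating "www.some.com" and "some.com" as the same site, and ignoring any port in HTTP_HOST, are assumptions added here rather than anything the code above does.

    ```php
    <?php
    // Hypothetical helper; pass in $_SERVER['HTTP_HOST'] and the referrer
    // (or null when the header was not sent).
    function is_same_host($host, $referer)
    {
        if (!is_string($referer) || $referer === '') {
            return false; // header absent or empty
        }
        $refHost = parse_url($referer, PHP_URL_HOST);
        if (!is_string($refHost)) {
            return false; // malformed URL or no host component
        }
        // Assumption: compare case-insensitively, without port or leading "www.".
        $normalize = function ($h) {
            $h = strtolower(preg_replace('/:\d+$/', '', $h));
            return strpos($h, 'www.') === 0 ? substr($h, 4) : $h;
        };
        return $normalize($refHost) === $normalize($host);
    }

    var_dump(is_same_host('www.some.com', 'http://some.com/whatever'));  // bool(true)
    var_dump(is_same_host('some.com', 'http://www.someother.com/page')); // bool(false)
    var_dump(is_same_host('some.com', null));                            // bool(false)
    ```

    In a page, the referrer could be read with isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : null, and it is probably worth passing it through htmlspecialchars() before echoing it into a link.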
    - John

  10. #10
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2


    The thing to consider is that | has a very low operator precedence — that's why you usually see an alternative enclosed in brackets.

    Code:
    $thePat1 = "/^http:\/(\/www\.)|(\/)some\.com/";
    'Match either "http://www." at the start of the string, or "/some.com" anywhere in the string.'

    Code:
    $thePat2 = "/^(http:\/\/some\.com)|(http:\/\/www\.some\.com)/";
    'Match either the start of the string followed by "http://some.com" or "http://www.some.com" anywhere in the string.'
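
    Those two readings can be checked with a short sketch. It uses preg_match() because the ereg functions take no delimiters (the slashes around the pattern would be treated as literal characters there) and are gone in PHP 7. The test URLs and the $fixed pattern are illustrative assumptions, not something from the original post.

    ```php
    <?php
    $thePat1 = '/^http:\/(\/www\.)|(\/)some\.com/';
    $thePat2 = '/^(http:\/\/some\.com)|(http:\/\/www\.some\.com)/';
    // One possible tightening (an assumption): a single optional group, both anchored.
    $fixed   = '/^http:\/\/(www\.)?some\.com(\/|$)/';

    // Hypothetical test strings.
    $tests = array(
        'http://www.other.example/',     // starts with "http://www."
        'http://other.example/some.com', // contains "/some.com"
        'x http://www.some.com/',        // second branch of thePat2 is unanchored
        'http://some.com/whatever',
        'http://www.some.com/whatever',
    );

    foreach ($tests as $t) {
        printf("%-32s pat1=%d pat2=%d fixed=%d\n", $t,
            preg_match($thePat1, $t), preg_match($thePat2, $t), preg_match($fixed, $t));
    }
    ```

    The first two strings slip through thePat1 and the third slips through thePat2, while $fixed accepts only the last two.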
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  11. The Following User Says Thank You to Twey For This Useful Post:

    jscheuer1 (03-31-2009)
