Thread: RegExp weirdness

  1. #11
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    30,495
    Thanks
    82
    Thanked 3,449 Times in 3,410 Posts
    Blog Entries
    12

    Thanks, Twey. Anyways, what do you think about my latest approach vs. the one right before it:

    PHP Code:
    <?php
    $theHost = $_SERVER['HTTP_HOST'];
    $theBack = $_SERVER['HTTP_REFERER'];
    $theBParse = parse_url($theBack);

    if ($theBParse['host'] == $theHost)
        echo $theBack; // or do whatever if referrer is from the same domain
    else
        echo 'other domain'; // or do whatever if the referrer is from another domain
    ?>
    vs:

    PHP Code:
    <?php
    $thePat = '^http://' . $_SERVER['HTTP_HOST'];
    $theBack = $_SERVER['HTTP_REFERER'];
    if (ereg($thePat, $theBack))
        echo $theBack;
    else
        echo 'other domain';
    ?>
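
For anyone trying the second version on a current PHP install: ereg() was deprecated in PHP 5.3 and removed in PHP 7, so that call no longer runs there. A rough equivalent using preg_match() might look like the sketch below. The function name refererStartsWithHost() and the sample values are made up for illustration, and the host and referrer are passed in as arguments (instead of being read from $_SERVER) so the snippet can be tried on its own.

```php
<?php
// Hypothetical helper: does $referer begin with "http://" followed by $host?
function refererStartsWithHost(string $host, string $referer): bool
{
    // preg_quote() escapes regex metacharacters in the host (the dots would
    // otherwise match any character) and the "#" used as the delimiter.
    $pattern = '#^http://' . preg_quote($host, '#') . '#';

    return preg_match($pattern, $referer) === 1;
}

$host = 'www.example.com';                 // stands in for $_SERVER['HTTP_HOST']
$back = 'http://www.example.com/page.php'; // stands in for $_SERVER['HTTP_REFERER']

echo refererStartsWithHost($host, $back) ? 'same domain' : 'other domain';
```

Note that a plain prefix test like this still accepts a referrer such as http://www.example.com.evil.test/, which is one reason the parse_url() version is the more accurate of the two.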
    I also had another approach using substr() and strlen(), and an equality comparison between:

    PHP Code:
    'http://' . $_SERVER['HTTP_HOST']
    and a substr of equal length counting from the beginning of:

    PHP Code:
    $_SERVER['HTTP_REFERER']
    But I think the best is the one at the top of this post. I'm still a great novice at PHP, but I think it combines maximum accuracy with the lowest possible overhead for that accuracy.
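
The substr()/strlen() comparison described above could be sketched roughly as follows. This is a reconstruction from the description, not the code that was actually used; the function name and sample values are assumptions.

```php
<?php
// Hypothetical helper: compare the start of the referrer with "http://" + host.
function refererHasHostPrefix(string $host, string $referer): bool
{
    $prefix = 'http://' . $host;

    // Take as many characters from the start of the referrer as the prefix
    // is long, then test the two strings for equality.
    return substr($referer, 0, strlen($prefix)) === $prefix;
}

$host = 'www.example.com';                          // stands in for $_SERVER['HTTP_HOST']
$back = 'http://www.example.com/gallery/index.php'; // stands in for $_SERVER['HTTP_REFERER']

echo refererHasHostPrefix($host, $back) ? 'same domain' : 'other domain';
```

Like the regex version, this only checks a prefix, and it treats an https:// referrer from the same host as another domain.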

    Unfortunately, the server I'm working on does not support the component parameter of parse_url(), otherwise it could have been even simpler, or at least more direct looking.
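
For reference, on servers where the component parameter is available (it was added in PHP 5.1.2), the host can be fetched in a single call. A minimal sketch, with a made-up function name and sample URLs:

```php
<?php
// Hypothetical helper: true when the referrer's host equals the given host.
function refererIsSameHost(string $host, string $referer): bool
{
    // With PHP_URL_HOST, parse_url() returns just the host as a string,
    // null if the URL has no host part, or false if the URL is too
    // malformed to parse.
    $refererHost = parse_url($referer, PHP_URL_HOST);

    return is_string($refererHost) && $refererHost === $host;
}

var_dump(refererIsSameHost('www.example.com', 'http://www.example.com/a.php'));
var_dump(refererIsSameHost('www.example.com', 'http://elsewhere.test/a.php'));
```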
    Last edited by jscheuer1; 04-01-2009 at 06:25 AM. Reason: spelling
    - John
    ________________________

    Show Additional Thanks: International Rescue Committee - Donate or: The Ocean Conservancy - Donate or: PayPal - Donate

  2. #12
    Join Date
    Jun 2005
    Location
    英国
    Posts
    11,876
    Thanks
    1
    Thanked 180 Times in 172 Posts
    Blog Entries
    2

    Yes, parse_url() is the way to go — it's designed for just this situation, and can be optimised based on that.
    Twey | I understand English | 日本語が分かります | mi jimpe fi le jbobau | mi esperanton komprenas | je comprends français | entiendo español | tôi ít hiểu tiếng Việt | ich verstehe ein bisschen Deutsch | beware XHTML | common coding mistakes | tutorials | various stuff | argh PHP!

  3. #13

    Quote: Originally Posted by Twey
    Yes, parse_url() is the way to go — it's designed for just this situation, and can be optimised based on that.
    Yes, that's what I thought. Now, I'm a little surprised you haven't mentioned potential problems or issues with the basic idea of even looking for the referrer, let alone making a link based upon it.

    Do I take it that you are silently saying that if it is gotten in this manner and used only to make a link back to the current site itself, it's OK?

    In limited testing, if there is no referrer, there is no error, and my code will follow the 'other domain' path.
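
The no-referrer case can also be handled explicitly, which avoids the undefined-index notice that reading $_SERVER['HTTP_REFERER'] can raise when the header is absent. A sketch, assuming PHP 7.1 or later (for the ?? operator and the nullable return type); pickBackLink() is a made-up name, and it takes the $_SERVER-style array as an argument so it can be tried without a web server.

```php
<?php
// Hypothetical helper: return the referrer if it is on the same host,
// otherwise null so the caller can fall back to the generic choices.
function pickBackLink(array $server): ?string
{
    $host    = $server['HTTP_HOST'] ?? '';
    $referer = $server['HTTP_REFERER'] ?? '';

    if ($host === '' || $referer === '') {
        return null; // header missing: same path as 'other domain'
    }

    $parts = parse_url($referer);

    // parse_url() returns false for badly malformed URLs, and the 'host'
    // key is absent for relative URLs, so check both before comparing.
    if (!is_array($parts) || !isset($parts['host']) || $parts['host'] !== $host) {
        return null;
    }

    return $referer;
}

$back = pickBackLink($_SERVER);

// The referrer is client-supplied, so it is escaped before being printed.
echo $back === null ? 'other domain' : htmlspecialchars($back, ENT_QUOTES, 'UTF-8');
```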

    I'm sure this could be abused, but I'm just thinking of offering a back button in certain cases, with other generic (hard coded) choices, or if no on site referrer is available, just the generic choices.

    One other thing, I'm thinking that as long as server security is up to snuff, no one can spoof a referrer in this situation as being from the same domain as the site and have a link created that goes off site or anywhere on site they couldn't reach via the address bar. Or am I mistaken?
    Last edited by jscheuer1; 04-01-2009 at 04:21 PM. Reason: add bit about spoofed referrers
    - John

  4. #14

    Well, there's nothing intrinsically wrong with testing the referrer. So long as you bear in mind that it may be nonsensical or non-existent, and always provide an alternative means of accessing anything important, you'll be fine.

    Referrer-spoofing is not a server issue but a client issue. It is a client header and as such the client can send anything it wants in that field, providing it isn't stripped or otherwise filtered by some intermediary. This means it needs to be treated with the same paranoia and skepticism as any other user input. Remember, too, that it may not be the browser directly but some piece of malicious software on the user's machine that sets the referrer, so XSS is a possibility.

    Your idea of using the referrer to add a 'back' button to your page is a terrible one. The browser already has such facilities; there's no need to duplicate them. The only time it should be necessary to provide your own navigation controls is when your site has its own internal structure that may not be related to the order of pages traversed by the browser.
    Twey

  5. #15

    I was thinking about this in relation to a site I master, but then just got interested in the concept. At the same time, I'm formulating a more comprehensive approach to this phase of that site. It probably won't use what we've been discussing here. However, it (what we have been discussing) could help drive a pretty mean PHP breadcrumb scenario. But you don't think that would be secure?
    - John

  6. #16

    It's perfectly secure so long as you escape the input before displaying it on the page, as you would with any other user input. In fact, even if you don't, it would require some peculiar setup to take advantage of it: a malicious filter, perhaps, or some form of malware on the client system, or a browser bug or feature (as far as I know, none such exists) that allowed one to specify the referrer in a link.

    I don't think that breadcrumbs based on the referrer are very useful. Again, it already exists in the browser (right-click your 'back' or 'forward' button some time). Breadcrumbs are about where you are in the site. For example, notice the breadcrumbs at the top of this page: DD Forums > General Coding > PHP > RegExp weirdness, even though the actual path I took to get here was more like New Posts > Page 2 > RegExp weirdness (four keystrokes to get from desktop to DD new posts, oh yes). It simply wouldn't be useful to duplicate that.
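
To make the distinction concrete: a structural breadcrumb trail is derived from where the page sits in the site, with no reference to the referrer at all. A small sketch, assuming a made-up $parents map from each page to its parent (the page names are the ones from the example above):

```php
<?php
// Hypothetical site structure: each page maps to its parent (null = top level).
$parents = [
    'DD Forums'        => null,
    'General Coding'   => 'DD Forums',
    'PHP'              => 'General Coding',
    'RegExp weirdness' => 'PHP',
];

// Walk up from the current page to the top, then reverse the trail.
function breadcrumbs(array $parents, string $page): string
{
    $trail   = [];
    $seen    = [];
    $current = $page;

    while ($current !== null && !isset($seen[$current])) {
        $seen[$current] = true; // guard against accidental cycles in the map
        $trail[]        = htmlspecialchars($current, ENT_QUOTES, 'UTF-8');
        $current        = $parents[$current] ?? null;
    }

    // "&gt;" is the HTML-escaped form of the ">" separator.
    return implode(' &gt; ', array_reverse($trail));
}

echo breadcrumbs($parents, 'RegExp weirdness');
```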
    Twey

  7. #17

    OK, could you define:

    escape the input before displaying it on the page
    Does this mean filtering out anything that could be javascript and/or HTML code? If not, please be more specific. If so, is there a PHP function already for that, or must one make one's own?

    I'm pretty clever with code, but I'm really a novice at best when it comes to PHP.

    Also, in my code, nothing from the user gets displayed on the page until it has been determined that the referrer is from the same domain, and then only a link to that referrer. If there is no referrer, or if the referrer is from another domain, or doesn't exist, that's when the fall back hard coded include or echoed content would be shown - not secure enough in and of itself though I take it?
    - John

  8. #18

    Quote: Originally Posted by jscheuer1
    Does this mean filtering out anything that could be javascript and/or HTML code? If not, please be more specific. If so, is there a PHP function already for that, or must one make one's own?
    Not necessarily filtering out entirely, but making sure the input is safe to be used in the context in which you intend to output it. This is a general principle that should be applied to all user input. For HTML, the PHP function htmlspecialchars() should do the task (basically, that means replacing < with &lt;, > with &gt;, & with &amp;, and " with &quot;). A different filter would need to be applied if you intended the input to be used as part of an SQL query or a shell command, for example.
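
A quick way to see what htmlspecialchars() does to a string is sketched below. The sample input is made up; ENT_QUOTES is passed so that single quotes are converted as well, and the character set is given explicitly.

```php
<?php
$input = '<a href="x">Tom & Jerry</a>';

// Convert the characters that are special in HTML into entities.
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```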
    Quote: Originally Posted by jscheuer1
    Also, in my code, nothing from the user gets displayed on the page until it has been determined that the referrer is from the same domain, and then only a link to that referrer. If there is no referrer, or if the referrer is from another domain, or doesn't exist, that's when the fall back hard coded include or echoed content would be shown - not secure enough in and of itself though I take it?
    Really, this is hardly a security issue at all — as I said above, the circumstances for a third party to inject harmful code using this feature would have to be exceptional to the point of considering the client machine effectively compromised already. If, hypothetically, an attacker was capable of altering the referrer on a whim, and you failed to handle it with the proper paranoia, it would be possible to write some session-stealing XSS code to the page and thereby hijack the user's account on your site. The string used to do so could very well contain a completely valid referrer, so simply checking for that will not suffice in terms of checking for validity of the whole (for example, http://www.johnssite.com/innocent/page.php#"><script>stealCookies();</script><br style="display:none;" class=").
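
To illustrate with the example string above: a host check alone lets it through, while escaping on output keeps the payload inert. This is only a sketch; the variable names are made up, and www.johnssite.com is the host from the example.

```php
<?php
$host    = 'www.johnssite.com';
$referer = 'http://www.johnssite.com/innocent/page.php#"><script>stealCookies();</script><br style="display:none;" class="';

// The host check passes, because everything nasty sits in the fragment.
$sameHost = parse_url($referer, PHP_URL_HOST) === $host;

// Unescaped, the first double quote in the referrer closes the href
// attribute early and the <script> element lands in the page.
$unsafeLink = '<a href="' . $referer . '">Back</a>';

// Escaped for an HTML attribute, the whole string stays inside href.
$safeLink = '<a href="' . htmlspecialchars($referer, ENT_QUOTES, 'UTF-8') . '">Back</a>';

var_dump($sameHost);
echo $safeLink, "\n";
```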

    PHP is not a hard language to grasp, at least at a fundamental level (there's nothing particularly complicated in it, but it hasn't been thought out well and so there are a lot of non-obvious, inelegant, inconsistent, or otherwise completely stupid things to remember about more advanced features), but the main thing to remember is that it is completely content-agnostic. It doesn't know or care about the content you are writing; to PHP, it's just strings of bytes. As such, it doesn't try to make the content safe in any way: all untrusted content must be verified and/or escaped by you. Forgetting to do so is one of the biggest causes of security holes in PHP-powered sites. You have to think of what you would do when writing that code by hand, and remember to consider exceptional cases like invalid characters. By hand we would have to change a < to a &lt; if we wanted to make sure that parts of the text weren't interpreted as something we didn't intend, so in PHP we have to as well, although the stakes are higher, because in many cases failure to do so will allow someone else to decide what gets interpreted and how.
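
As a rough illustration of escaping for the context, the same untrusted string needs different treatment depending on where it ends up. The sample value is made up; the functions are standard PHP ones for each context. For SQL the usual approach is a prepared statement, which is not shown here because it needs a database connection.

```php
<?php
$untrusted = 'Tom & "Jerry" <b>';

// HTML text or attribute value
$forHtml = htmlspecialchars($untrusted, ENT_QUOTES, 'UTF-8');

// One component of a URL (for example a query-string value)
$forUrl = rawurlencode($untrusted);

// A single argument on a shell command line (quoting style varies by platform)
$forShell = escapeshellarg($untrusted);

echo $forHtml, "\n", $forUrl, "\n", $forShell, "\n";
```

The point is that none of these outputs is interchangeable: the HTML-escaped form is not safe in a shell command, and the shell-quoted form is not safe in a page.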
    Twey
