-
John, unless you verify that the data sent is indeed from a checkbox, there is a possibility that someone has made a duplicate form and submitted it with a text input instead of a checkbox. Or they used JavaScript to set a non-binary (text) value for that input. So to be entirely secure, every input (of any type) must be either validated or escaped. If you are blindly passing the whole submission on to an email or a web page, one way to do this is to loop through the $_POST array; if you are only selecting a few items from the array, they can be handled individually.
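A minimal sketch of both approaches, assuming a checkbox named "subscribe" whose only legitimate value is "yes". The field names and both helper functions are hypothetical, not taken from the original form:

```php
<?php
// Hypothetical helper: treat a checkbox as checked only when the submitted
// value is exactly the one the real form sends (assumed here to be "yes").
function checkbox_checked(array $post, string $field, string $expected = 'yes'): bool
{
    return isset($post[$field]) && $post[$field] === $expected;
}

// Hypothetical helper: escape every scalar value of a submission for display
// as HTML. Nested arrays are dropped rather than guessed at.
function escape_all(array $post): array
{
    $clean = [];
    foreach ($post as $key => $value) {
        if (is_scalar($value)) {
            $clean[$key] = htmlentities((string) $value, ENT_QUOTES, 'UTF-8');
        }
    }
    return $clean;
}

// Simulated submission in which the "checkbox" was spoofed with markup.
$post = ['name' => 'John', 'subscribe' => '<script>alert(1)</script>'];

$subscribed = checkbox_checked($post, 'subscribe'); // false: not the expected value
$clean = escape_all($post);                         // markup becomes entities
```

In a real handler you would pass $_POST instead of the simulated array. Validating against the expected value is stricter than escaping, so it is the better fit for inputs with a fixed set of legal values.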
-
Is there a PHP function for that, or should a regular expression and something like preg_replace or preg_match be used?
-
For database insertion, mysql_real_escape_string() is all you need (the mysql_* functions were removed in PHP 7, so on current versions use mysqli_real_escape_string() or prepared statements instead).
For display, it depends on the situation: if you want only plain text, you can use strip_tags(); if you want to display HTML tags (as text, not markup), you can use htmlentities().
If you want to allow some tags but remove others, you're best off using a custom regex via preg_replace() (or using BBCode tags instead).
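A rough comparison of the display options, using a made-up input string. The third call relies on the optional second argument of strip_tags(), which whitelists tags; it is shown as an alternative to a custom regex, not as what was recommended above:

```php
<?php
// Made-up sample input containing markup and a script block.
$input = '<b>Hello</b> <script>alert("x")</script>';

// Plain text only: tags are removed, but the text between them stays.
$plain = strip_tags($input);

// Show the markup as text: special characters become entities.
$shown = htmlentities($input, ENT_QUOTES, 'UTF-8');

// Keep <b> but remove every other tag. Attributes on allowed tags are
// left untouched, so this alone does not make the result safe.
$some = strip_tags($input, '<b>');
```

Note that strip_tags() removes the script tags but not the code between them; the code is simply inert once the tags are gone.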
-
Like all of the options that traq listed, the question you are asking, John, must have a context.
If you are displaying it in HTML, htmlentities() is the simplest answer. That will disable all HTML and along with it Javascript, CSS, etc. Generally speaking that's what you'll use to escape a form that may be vulnerable to XSS.
But there are lots of other situations for escaping-- databases, emails, etc.
I'm partly repeating some of what you said, traq, but I hope I'm adding info, not just being redundant :)
-
If you go back to the OQ (original question), it's for email to the webmaster. So, what should be used for that? The OP is using htmlentities, and wants to know if that's sufficient. From what you two are saying, it sounds as though it's not. What would you recommend for email?
-
What might be a vulnerability of email? What markup is active in an email program? Aside from HTML I don't know of anything that might be relevant. So htmlentities() should be fine. And if the email is not being sent (or viewed) as HTML, then you don't even need to do that-- most email programs are smart enough to ignore code unless it's specifically intended (such as through html headers for the message).
I think our posts above are suggesting alternative methods might be needed for other things with forms in general, not specific to sending as an email.
The only specific kind of email-related escaping I can think of is what might be parsed automatically such as a link. So you could attempt to "escape" all links by removing, for example, "http", and that's not something for which a default function exists in PHP.
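A minimal sketch of that idea, assuming it is enough to drop the "http://" or "https://" scheme. The sample text is made up, and a mail client may still auto-link bare domains, so treat this as an illustration only:

```php
<?php
// Made-up sample text containing two links.
$text = "See http://example.com and HTTPS://example.org/page";

// Remove the scheme (case-insensitively) so the address is less likely
// to be turned into a clickable link by the mail client.
$unlinked = preg_replace('#https?://#i', '', $text);
```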
-
yes, a lot of it depends on the email client.
however, IMO, if you take normal precautions and the user's email client decides to do something stupid anyway (like taking text and turning it into a block of script just because it looks like one), then it's not your problem. The user needs to get a different email client.
The number one issue here is whether or not the email is being sent as plain text or with a html mime-type. By default, email is sent as plain text. If you want php to send html emails, you need to send the mime-type along. So, in theory, if it's a plain-text email, no particular precautions need to be taken.
I don't know if any email client would automatically change the mime-type based on the content (for example, if the body of the email included html markup or something like
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
), but I can imagine it happening, and if there is such a loophole, then I'm sure, at some point, someone's exploited it by entering that very line into a text field.
Therefore, to be on the safe side, it might be prudent to use htmlentities(), regardless - changing the above example into
&lt;meta http-equiv=&quot;Content-Type&quot; content=&quot;text/html; charset=ISO-8859-1&quot; /&gt;
. It wouldn't hurt anything, in any case.
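A minimal sketch of sending the mime-type explicitly as plain text while still running the body through htmlentities(). The From address is a placeholder, and the mail() call is commented out so the snippet runs without a mail server:

```php
<?php
// Declare the message as plain text instead of relying on the default.
// The From address is a placeholder for illustration.
$headers = "From: webmaster@example.com\r\n"
         . "MIME-Version: 1.0\r\n"
         . "Content-Type: text/plain; charset=UTF-8";

// Submitted text that tries to smuggle in an HTML content type.
$submitted = '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />';

// Escape regardless, so the markup arrives as harmless text.
$body = htmlentities($submitted, ENT_QUOTES, 'UTF-8');

// mail($myaddress, 'Contact Form Submission', $body, $headers);
```

For an HTML email the Content-Type line would be text/html instead, and escaping the submitted text would then no longer be optional.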
-
ok let me try to remember what i wanted to say
first off thanks everyone, interesting stuff.
im using gmail.
sounds like to be safe, use htmlentities or w/e. and according to drj, to address the checkbox possibly being an issue, i cant use htmlentities, so address it thru $_POST?
PHP Code:
$new_string = ereg_replace("[^A-Za-z0-9]", "", $body);
mail($myaddress, 'Contact Form Submission', $new_string);
thatll solve it i guess? if so how do i change it to accept -, @, ., etc. is !@#$%^&*() safe to accept thru email?
thx so much everyone
-
you can use htmlentities() on a checkbox - it won't have any effect under normal circumstances, but may prevent problems if someone spoofs your form and tries to enter code using the checkbox's POST name.
BTW, the regex you show above will remove anything non-alphanumeric (including whitespace, symbols, etc.) which, I suspect, may not be what you're trying to do.
Forexampleitwouldcreatesentenceslikethis (<--note the lack of punctuation.)
If you want to simply remove all html tags (effectively killing javascript), use strip_tags().
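To accept a few extra symbols such as -, @, and ., one option is to list them in the character class. A minimal sketch, assuming preg_replace() rather than ereg_replace(), with a made-up message and an arbitrary choice of allowed symbols:

```php
<?php
// Made-up message body.
$body = "Hi, reach me at john-doe@example.com.\nThanks! <b>bye</b>";

// Strip tags first; otherwise removing "<", "/" and ">" would leave the
// tag names behind as stray letters.
// Then keep letters, digits, whitespace, and the listed symbols only.
// The hyphen comes last in the class so it is read as a literal hyphen.
$new_string = preg_replace('/[^A-Za-z0-9\s@.,!?-]/', '', strip_tags($body));
```

Whitespace (\s) is in the allowed set, so words and line breaks survive.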
-
I was thinking something a little more robust. Tags should be stripped or converted to entities. But I was thinking why not kill everything not on lines 2 through 7 excluding the DEL character on the ASCII chart:
http://en.wikipedia.org/wiki/File:ASCII_Code_Chart.svg
That would mean only allowing hex 20 through hex 7e (space to tilde). The only important things missing would be tabs and line breaks (\s). So something like:
PHP Code:
$new_string = strip_tags(preg_replace("/[^\x20-\x7e\s]/", '', $body));
Notes:
- I used preg instead of ereg because ereg has been deprecated (and was removed entirely in PHP 7).
- Unless you still need the original contents of $body, there's no reason to create a new variable:
PHP Code:
$body = strip_tags(preg_replace("/[^\x20-\x7e\s]/", '', $body));
That would turn this:
Quote:
<span>Some stuff©</span>
into:
Quote:
Some stuff
It occurred to me that you can use the literal characters in the regex, so I did and made a working demo:
PHP Code:
<!DOCTYPE html>
<html>
<head>
<title>Strip Tags & Machine Code ('high' and 'low' ASCII)</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<?php
$str = "<span>Some stuff©</span>\r<div>More Stuff@stuff.com</div><script type='text/javascript'>
Some Destructive Javascript Code That Won't Run Without Script Tags
</script>";
$str = strip_tags(preg_replace("/[^ -~\s]/", '', $str));
echo "<textarea cols=50 rows=5>$str</textarea>";
?>
</body>
</html>