bluewalrus
03-11-2010, 05:31 PM
I'm trying to use preg_replace to find greater-than and less-than symbols that aren't part of an HTML tag. The pattern works when I test it in Dreamweaver, but it doesn't work on my page. Any ideas? Thanks.
//Find less than and greater thans that arent part of an html tag
$patterns[] = '/[^ol,li,p,ul,img,/p,/ol,/li,/ul,em,/em,strong,/strong,br,table,tbody,tr,td,/table,/td,/tbody,/tr,sup,sub,/sup,/sub,a,/a,class=",start="]>/i';
$replacements[] = '&gt;';
$patterns[] = '/<[^ol,li,p,ul,img,/p,/ol,/li,/ul,em,/em,strong,/strong,br,table,tbody,tr,td,/table,/td,/tbody,/tr,sup,sub,/sup,/sub,a,/a]/i';
$replacements[] = '&lt;';
$contents = preg_replace($patterns, $replacements, $input);
echo htmlspecialchars($contents);
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'p' in /home/bluewalr/public_html/christophermacdonald.net/protocol.php on line 182
Line 182 is the $contents = preg_replace(...) line. I'm swapping several other things with the same call, and these are the only patterns that cause the error. When I say this works in Dreamweaver, I mean that if I put
<[^ol,li,p,ul,img,/p,/ol,/li,/ul,em,/em,strong,/strong,br,table,tbody,tr,td,/table,/td,/tbody,/tr,sup,sub,/sup,/sub,a,/a] in as the search value, it finds the less-than symbol.
james438
03-14-2010, 09:34 AM
Dunno how I missed this post. I enjoy these questions even though I am certainly no expert on the subject.
Problem 1: The answer is in the error message. See how you are using the /i modifier? The "i" comes right after the closing delimiter, which in this case is the "/" symbol. The delimiters mark where the pattern starts and where it ends. If you want to use the delimiter character inside your pattern, you need to escape it by putting a backslash right before it (a backslash in front of pretty much anything that is not a letter or a number makes that character literal). Since you did not escape the forward slash in "/p", the PCRE engine took it as the closing delimiter and read the "p" after it as a modifier, and since "p" is not one of the valid modifiers it gave you the error message.
Problem 2: You are trying to NOT match the HTML tags. I can see that, but you placed them in a character class. To use an abbreviated example: [^ol,li,p,ul,img,/p,/ol,/li,/ul] matches any single character except a comma, a forward slash, or one of the letters o, l, i, p, u, m, g. Instead you need to put the tag names into a subpattern (parentheses) as opposed to a character class (square brackets), separated by "|". You also do not want the tag name to become part of the match, so we use a negative lookahead and a negative lookbehind. What we now have is '/<(?!ol|li|p|ul|img|\/p|\/ol|\/li|\/ul)/i' and '/(?<!ol|li|p|ul|img|\/p|\/ol|\/li|\/ul)>/i'. Lookahead and lookbehind are also known as assertions. Assertions do not consume any characters, so basically you are saying "match the greater-than symbol unless it is part of an HTML tag". The signal for a negative lookahead is ?! and for a negative lookbehind is ?<!, which state that what you are matching must not be followed by this pattern or preceded by this pattern, respectively.
Problem 3: I suspect you are unintentionally leaving out some tags, such as <hr>, as well as the closing part of hyperlinks, where the > comes right after a single or double quote. There are probably others, but you may be doing this intentionally. I'm not sure what you are trying to do exactly.
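To make the difference described in Problem 2 concrete, here is a minimal sketch that runs a character-class pattern and a lookahead pattern over a few sample strings. The "#" delimiter, the \b word boundary, the /? for closing tags, and the sample strings are assumptions added for illustration; they are not part of the patterns above.

```php
<?php
// A character class matches ONE character. The commas and "words" inside
// are just a set of individual characters, not a list of tag names.
$class = '#<[^ol,li,p,ul,img,/p,/ol,/li,/ul]#i';

// A negative lookahead tests whole alternatives without consuming them.
// Assumption of this sketch: /? covers closing tags, and \b keeps "p"
// from also excluding longer names that merely start with "p".
$lookahead = '#<(?!/?(?:ol|li|p|ul|img)\b)#i';

$samples = ['<p>', '</ul>', '<img src="x">', '<map>', '<x', '< 5'];
foreach ($samples as $s) {
    // preg_match() returns 1 when the pattern finds a "stray" <, else 0
    printf("%-14s class:%d lookahead:%d\n", $s, preg_match($class, $s), preg_match($lookahead, $s));
}
```

Note the '<map>' sample: the character class skips it only because "m" happens to be one of the excluded characters, while the lookahead flags it because "map" is not in the list.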
Anyway, try playing around with the following:
<?php
// ">" that does not directly follow one of the listed tag names or attributes
$patterns1 = '/(?<!ol|li|p|ul|img|\/p|\/ol|\/li|\/ul|em|\/em|strong|\/strong|br|table|tbody|tr|td|\/table|\/td|\/tbody|\/tr|sup|sub|\/sup|\/sub|a|\/a|class=\"|start=\")>/i';
$replacements1 = '&gt;';
// "<" that is not directly followed by one of the listed tag names
$patterns2 = '/<(?!ol|li|p|ul|img|\/p|\/ol|\/li|\/ul|em|\/em|strong|\/strong|br|table|tbody|tr|td|\/table|\/td|\/tbody|\/tr|sup|sub|\/sup|\/sub|a|\/a)/i';
$replacements2 = '&lt;';
// $input is the text to clean, as in the original post
$contents = preg_replace($patterns1, $replacements1, $input);
$contents = preg_replace($patterns2, $replacements2, $contents);
echo htmlspecialchars($contents);
?>
Sadly, I never got proficient in using arrays in PCRE which is why I am not using arrays in my finished product.
I kinda skimmed over most everything, so if you want me to elaborate on anything I said just let me know.
Side note: remember how "/" is used as a delimiter? You can use pretty much anything as a delimiter as long as it is not a letter, a number, a backslash, or whitespace. For example, if you use the octothorpe (#) you won't have to worry about adding backslashes to your forward slashes.
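As a small sketch of the delimiter options, the three patterns below all match the same closing tag. The sample string and variable names are made up for illustration. preg_quote() is a standard PHP function; its second argument names the delimiter so that it gets escaped too.

```php
<?php
$html = 'a</p>b';  // hypothetical sample input

// Option 1: keep "/" as the delimiter and escape it inside the pattern.
$escaped = preg_match('/<\/p>/', $html);

// Option 2: pick a delimiter that does not occur in the pattern.
$hash = preg_match('#</p>#', $html);

// Option 3: let preg_quote() escape a literal string for you.
$quoted = preg_match('/' . preg_quote('</p>', '/') . '/', $html);

var_dump($escaped, $hash, $quoted);
```

Option 2 is usually the easiest to read when the pattern is full of closing tags.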
Important: What you are doing will match the greater-than and less-than symbols and replace them with their HTML entities, which look exactly the same in the browser, so even though the code now works you won't notice a difference on the rendered page.
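For anyone who wants the array form from the original question, here is a hedged sketch rather than a finished solution. The tag list is abbreviated, and the names $names, $closing, $behind and the sample $input are hypothetical. It assumes the replacements are meant to be the &lt; and &gt; entities. Each lookbehind alternative is written out at the top level because, depending on the PCRE version, a lookbehind branch may need a fixed length.

```php
<?php
$names = ['ol', 'li', 'p', 'ul', 'em', 'strong', 'br'];  // abbreviated list

// Build "/ol", "/li", ... so closing tags are covered in the lookbehind.
$closing = [];
foreach ($names as $n) {
    $closing[] = '/' . $n;
}
$behind = implode('|', array_merge($names, $closing));

// preg_replace() pairs patterns and replacements by position and applies
// them one after another to the same subject string.
$patterns = [
    '#<(?!/?(?:' . implode('|', $names) . ')\b)#i',  // "<" not starting a listed tag
    '#(?<!' . $behind . ')>#i',                      // ">" not right after a listed name
];
$replacements = ['&lt;', '&gt;'];

$input  = '<p>if a < b and b > c</p>';
$output = preg_replace($patterns, $replacements, $input);

echo $output, "\n";                    // the browser renders this the same as $input
echo htmlspecialchars($output), "\n";  // shows the entities that were inserted
```

This keeps the limitations already mentioned: a > that directly follows a listed name in ordinary text (for example "help>") is left alone, and a tag whose > follows an attribute value is not recognised as a tag.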