View Full Version : Quick and dirty way to produce valid HTML markup?

07-16-2016, 01:25 AM
What do folks think about this method of taking an invalid string of HTML and making it into a valid one?

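$dirty = "<div>some content</div></div></p><img src='bob.jpg'>";
$x = new DOMDocument;
// Parse the markup; @ hides the warnings libxml raises for invalid input.
@$x->loadHTML($dirty);
$clean = $x->saveHTML();
// Strip everything outside <body>, then drop the leading lines of wrapper output.
$str = preg_replace('/^.*<body>|<\/body>.*$/m', '', $clean);
echo implode("\n", array_slice(explode("\n", $str), 2));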

07-21-2016, 10:02 PM
John, should this work when I put it inside a PHP file? I can't get it to work.

07-22-2016, 02:29 AM
PHP 5+ required. I've made some refinements** since, but yes, as long as PHP's DOMDocument class is supported, it should more or less work.
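For anyone who can't get it to run, here is a quick diagnostic sketch. It assumes the usual causes are an old PHP version or the DOM extension not being loaded; the `$report` array is just an illustrative name, not part of the original snippet.

```php
<?php
// Diagnostic sketch: report whether the pieces this technique relies on exist.
$report = array(
    'php_version'    => PHP_VERSION,
    'dom_available'  => class_exists('DOMDocument'),
    'tidy_available' => function_exists('tidy_repair_string'),
);

foreach ($report as $name => $value) {
    echo $name, ': ', var_export($value, true), "\n";
}
```

If `dom_available` comes back false, the DOMDocument snippet cannot work on that server until the DOM extension is enabled. Also make sure the code sits inside `<?php ... ?>` tags in the file.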

**said refinements (most of them at least*):

header('Content-Type: text/html; charset=utf-8'); // most often wise - adjust if necessary
$dirty = "<div>some content©</div></div></p><img src='bob.jpg'>
<div>Hello! €</div></p><p></div>";
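$x = new DOMDocument;
// Parse the markup; @ hides the warnings libxml raises for invalid input.
// Assumption: UTF-8 characters such as © and € may need an encoding hint here.
@$x->loadHTML($dirty);
$clean = $x->saveHTML();
// Drop the first line (the DOCTYPE), then strip everything outside <body>.
$str = implode("\n", array_slice(explode("\n", $clean), 1));
echo trim(preg_replace('/^.*<body>|<\/body>.*$/m', '', $str));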

*some refinement may be required for any particular purpose. If tidy is available, use that. I'm interested in this approach (the one without tidy) because I'm working on code that already requires PHP 5.2 and that will often be run in environments where tidy is not available.
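A rough sketch of how "use tidy when it's there, otherwise fall back to DOMDocument" could look. The function name `repair_fragment` and the tidy options chosen here are my assumptions, not something from the code above; adjust to taste.

```php
<?php
// Hypothetical helper: repair an HTML fragment with tidy if the extension
// is loaded, otherwise with the DOMDocument approach from this thread.
function repair_fragment($dirty)
{
    if (function_exists('tidy_repair_string')) {
        // show-body-only returns just the repaired fragment, no <html> wrapper.
        $config = array('show-body-only' => true, 'output-html' => true);
        return trim(tidy_repair_string($dirty, $config, 'utf8'));
    }

    if (!class_exists('DOMDocument')) {
        return false; // neither tidy nor DOM is available
    }

    $x = new DOMDocument;
    @$x->loadHTML($dirty); // @ hides libxml warnings about the invalid markup
    $clean = $x->saveHTML();
    // Drop the DOCTYPE line, then strip everything outside <body>.
    $str = implode("\n", array_slice(explode("\n", $clean), 1));
    return trim(preg_replace('/^.*<body>|<\/body>.*$/m', '', $str));
}

echo repair_fragment("<div>some content</div></div></p><img src='bob.jpg'>");
```

The two branches will not necessarily produce byte-identical output (tidy and libxml repair markup differently), so treat this as a starting point rather than a drop-in.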

Demo of the code in this post:


Use the browser's view source and compare to the $dirty string.

07-22-2016, 09:55 AM
PHP 5+ required.