My web application produces HTML5 output as a concatenation of a variable number of views. The end result is a mess of indentation:
</div> </div> <div id="content"> <div id="question-header"> <h1>
I want to indent the code to obscure the origin of individual views and to make the output easier to follow.
I have looked into the Tidy PHP extension but all my attempts to make it work with HTML5 have produced improper indenting.
The closest thing to what you are looking for in PHP land is Dindent, https://github.com/gajus/dindent. Dindent is an HTML beautifier that uses regular expressions to indent the markup. This is different from Tidy, which acts as a DOM parser.
From the documentation:
There is a good reason not to use regular expression to parse HTML.
However, DOM parser will rebuild the whole HTML document. It will add
missing tags, close open block tags, or remove anything that’s not a
valid HTML. This is what Tidy does, DOM, etc. This behavior is
undesirable when debugging HTML output. Regex based parser will not
rebuild the document. Dindent will only add indentation, without
otherwise affecting the markup.
Dindent's sole purpose is to indent HTML markup. It lets you configure which elements are treated as inline and which as block.
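To make the regex-versus-DOM distinction concrete, here is a toy sketch of the regex-based idea. It is written in Python only to keep it self-contained; it is not Dindent's code or API, and the names `indent_html`, `VOID`, `TOKEN` and `OPEN_TAG` are made up for this illustration. It assumes well-behaved input (no `>` inside attribute values or comments, no `<pre>` or inline scripts).

```python
import re

# Hypothetical, incomplete list of void elements (tags with no closing tag).
VOID = {"br", "hr", "img", "input", "link", "meta"}

# A token is either a complete tag or a run of text between tags.
TOKEN = re.compile(r"<[^>]+>|[^<]+")
# Matches opening tags only; closing tags and <!...> declarations do not match.
OPEN_TAG = re.compile(r"<([a-zA-Z][a-zA-Z0-9-]*)")


def indent_html(html, unit="    "):
    """Put each tag or text run on its own line, indented by nesting depth.

    Nothing is added, closed, or removed: unmatched tags are kept as they are.
    """
    depth = 0
    lines = []
    for token in TOKEN.findall(html):
        text = token.strip()
        if not text:
            continue  # whitespace between tags
        if text.startswith("</"):
            # Closing tag: step back out, but never below column zero.
            depth = max(depth - 1, 0)
        lines.append(unit * depth + text)
        match = OPEN_TAG.match(text)
        if match and match.group(1).lower() not in VOID and not text.endswith("/>"):
            # Opening tag of a non-void element: following content nests deeper.
            depth += 1
    return "\n".join(lines)


if __name__ == "__main__":
    fragment = '</div> </div> <div id="content"> <div id="question-header"> <h1>'
    print(indent_html(fragment))
```

Run on the fragment from the question, the two stray `</div>` tags stay at column zero and the unclosed `<div>` and `<h1>` tags are left unclosed. A DOM-based tool would have dropped or repaired them. Unlike Dindent, this sketch treats every element as block, so inline elements also get their own line.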
If you want to obscure the origin of individual views, I suggest minifying the HTML instead. This has the added benefit of reducing the document size.
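As a rough idea of what the simplest form of minification does, the sketch below collapses every run of whitespace into a single space, which already removes per-view indentation. It is an illustration only (the function name `collapse_whitespace` is made up here), written in Python for brevity. It assumes the markup contains no `<pre>`, `<textarea>` or inline script content, where whitespace is significant. A real minifier handles those cases.

```python
import re


def collapse_whitespace(html):
    """Collapse each run of whitespace (spaces, tabs, newlines) into one space.

    Assumption: no whitespace-sensitive content such as <pre> or <textarea>.
    """
    return re.sub(r"\s+", " ", html).strip()


if __name__ == "__main__":
    print(collapse_whitespace("<div>\n    <p>Hi</p>\n</div>\n"))
```

Keeping one space rather than removing whitespace entirely avoids changing how adjacent inline elements render.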
As for making the HTML output easier to follow, browsers come with debugging tools that parse the page and display its DOM tree in an indented format, e.g. Web Inspector (https://trac.webkit.org/wiki/WebInspector) and Firebug (http://getfirebug.com/).