htmlSanityCheck
Validate the HTML itself
It would be nice if this tool could also validate the HTML.
You may be able to do this with JSoup directly; it looks like it can report parser errors.
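For illustration, here is a minimal sketch of what error tracking with JSoup might look like. It assumes the jsoup library is on the classpath; the class name `HtmlErrorCheck`, the method `findErrors`, and the limit of 50 tracked errors are made up for this example. Note that these are the parser's own tokenizer and tree-builder errors, which is not the same as full validation against the HTML specification.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.parser.ParseError;
import org.jsoup.parser.Parser;

// Hypothetical helper, not part of htmlSanityCheck.
public class HtmlErrorCheck {

    // Parses the given HTML and returns the parse errors jsoup recorded,
    // each formatted as "<position>: <message>".
    static List<String> findErrors(String html) {
        // Error tracking is off by default; 50 is an arbitrary upper bound.
        Parser parser = Parser.htmlParser().setTrackErrors(50);
        Jsoup.parse(html, "", parser);

        List<String> messages = new ArrayList<>();
        for (ParseError error : parser.getErrors()) {
            messages.add(error.getPosition() + ": " + error.getErrorMessage());
        }
        return messages;
    }

    public static void main(String[] args) {
        // The <b> element is still open when </p> arrives.
        for (String message : findErrors("<p>unclosed <b>text</p>")) {
            System.out.println(message);
        }
    }
}
```

Whether the reported errors are detailed enough for this tool would need to be checked against real-world pages.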
I also found this Stack Overflow post that shows how to use JSoup to parse the output of the W3C validator:
http://stackoverflow.com/questions/23737300/how-to-validate-html-using-java-getting-issues-with-jsoup-library
Another option could be the (very old) JTidy: http://jtidy.sourceforge.net/howto.html
After a quick check, I could not find a way to validate HTML with JSoup. However, there seems to be a revived version of JTidy on GitHub, https://github.com/jtidy/jtidy, whose latest release was in 2023.