zend-dom

No possibility to use libxml options in loadHTML

Open step307 opened this issue 8 years ago • 7 comments

This is a feature request.

According to: https://github.com/zendframework/zend-dom/blob/245d75d1cce819cb8da8726cf9c9ba563fa5d8f0/src/Document.php#L254

loadHTML is called without any options and there is no way to configure them.

In my case specifically, I'm missing LIBXML_HTML_NOIMPLIED, which I need in order to work with partial HTML.

step307 avatar Jan 26 '17 14:01 step307

@step307 Can you explain your problem or use case? The background of my question: the class is called Document, not Fragment or Partial.

Thanks!

froschdesign avatar Jan 27 '17 07:01 froschdesign

@froschdesign, in my case I want to parse and modify pieces of HTML, not the whole document. The default behavior is that DOMDocument always adds the "missing" <html> and <body> tags, which is absolutely undesired.

LIBXML_HTML_NOIMPLIED "turns off the automatic adding of implied html/body... elements", and LIBXML_HTML_NODEFDTD "prevents a default doctype being added when one is not found".

There are a lot of other options which definitely make sense. Full list: http://php.net/manual/en/libxml.constants.php

step307 avatar Jan 27 '17 09:01 step307

@step307

The default behavior is that DOMDocument always adds the "missing" <html> and <body> tags, which is absolutely undesired. … There are a lot of other options which definitely make sense.

I know this, but it doesn't matter. What is the concrete problem? Can you provide a short example which illustrates the problem and why the current implementation does not work?

And again: the class is named Document and not Fragment or Partial – therefore I ask!

froschdesign avatar Jan 27 '17 09:01 froschdesign

This is not a bug report, but a feature request. There is no problem and everything surely works as designed.

P.S. I can't find any Fragment or Partial class in this module, so I'm making the proposal for the existing Document class.

step307 avatar Jan 27 '17 10:01 step307

@step307 Sorry, you misunderstand me. Yes, it is a feature request, but why is the current implementation not enough? It already works for partials:

$results = Zend\Dom\Document\Query::execute(
    '.block__headline',
    new Zend\Dom\Document(
        '<div class="block"><h1 class="block__headline">Foobar</h1></div>'
    ),
    Zend\Dom\Document\Query::TYPE_CSS
);

var_dump(count($results)); // 1

If we extend the component to support the additional Libxml parameters, where is the benefit? Or do you have a different use case?

froschdesign avatar Jan 27 '17 16:01 froschdesign

@froschdesign, as I wrote, I need to parse and modify a piece of HTML. Here is a small illustration:

Imagine you want to update the title in your example:

$document = new \Zend\Dom\Document(
    '<div class="block"><h1 class="block__headline">Foobar</h1></div>'
);

$results = \Zend\Dom\Document\Query::execute(
    '.block__headline',
    $document,
    \Zend\Dom\Document\Query::TYPE_CSS
);

// Change the headline text on the first matched node.
$results->current()->textContent = 'Foobar2';

// Serialize the underlying DOMDocument back to HTML.
var_dump($document->getDomDocument()->saveHTML());

You will get extra DOCTYPE, html and body tags in the result.
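For comparison, here is a minimal sketch of the same modification done with PHP's native DOMDocument instead of Zend\Dom\Document, since the wrapper offers no way to pass libxml options. It is only an illustration of the two flags mentioned above, not a proposed patch. The XPath expression stands in for the CSS query, and the variable names are made up for this example.

```php
<?php
// Illustration only: uses PHP's built-in DOM extension directly,
// not Zend\Dom\Document, which does not expose libxml options.
$html = '<div class="block"><h1 class="block__headline">Foobar</h1></div>';

// Default loading: libxml adds a DOCTYPE plus implied <html> and <body>.
$full = new DOMDocument();
$full->loadHTML($html);

// Loading with both flags: no implied wrapper elements, no default DOCTYPE.
$partial = new DOMDocument();
$partial->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Find the headline (XPath used here in place of the CSS selector) and edit it.
$xpath = new DOMXPath($partial);
$headline = $xpath->query('//*[@class="block__headline"]')->item(0);
$headline->textContent = 'Foobar2';

$fullOutput = $full->saveHTML();       // wrapped in DOCTYPE/html/body
$partialOutput = $partial->saveHTML(); // only the original fragment, modified

echo $fullOutput, "\n", $partialOutput;
```

This sketch relies on the fragment having a single root element; with LIBXML_HTML_NOIMPLIED, input that has several top-level siblings may not be serialized the way you expect.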

step307 avatar Jan 27 '17 16:01 step307

This repository has been closed and moved to laminas/laminas-dom; a new issue has been opened at https://github.com/laminas/laminas-dom/issues/2.

weierophinney avatar Dec 31 '19 21:12 weierophinney