Missing AttributeNode prefix and namespace URI

Open fgateuil opened this issue 2 years ago • 2 comments

Hi,

I'm trying to find attribute values within an XML document, but the returned data appears to be incomplete.

Description

When I query an XML document for a namespaced attribute (for instance //@xlink:href), the returned xmlquery.Node has an empty Prefix and an empty NamespaceURI.

Steps to reproduce

package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/xmlquery"
)

func main() {
	xml := `<?xml version="1.0"?>
<root xmlns:xlink="http://www.w3.org/1999/xlink">
	<node xlink:href="http://www.github.com">Some text...</node>
</root>`

	root, _ := xmlquery.Parse(strings.NewReader(xml))
	node, _ := xmlquery.Query(root, "//@xlink:href")
	fmt.Println("NamespaceURI:", node.NamespaceURI)
	fmt.Println("Prefix:", node.Prefix)
	fmt.Println("Data:", node.Data)
}

Expected result

NamespaceURI: http://www.w3.org/1999/xlink
Prefix: xlink
Data: href

Actual result

NamespaceURI:
Prefix:
Data: href

Solution proposal

In github.com/antchfx/xmlquery/query.go#getCurrentNode:

func getCurrentNode(it *xpath.NodeIterator) *Node {
	n := it.Current().(*NodeNavigator)
	if n.NodeType() == xpath.AttributeNode {
		childNode := &Node{
			Type: TextNode,
			Data: n.Value(),
		}
		return &Node{
			Parent:       n.curr,
			Type:         AttributeNode,
			// START MODIFICATION
			NamespaceURI: n.NamespaceURL(),
			Prefix:       n.Prefix(),
			// END MODIFICATION
			Data:         n.LocalName(),
			FirstChild:   childNode,
			LastChild:    childNode,
		}
	}
	return n.curr
}

Additional information

If I am simply misusing the library, what is the correct way to do this? My main use case is as follows:

  • find all the @xlink:href attributes in the document;
  • reset the attribute value to another value.

fgateuil • Jun 29 '23 18:06

Right, the query result does not take the attribute node's prefix and namespace URL into account.

In the meantime, you can use the code below to find the parent element and then iterate over all of its attributes.

	node, _ := xmlquery.Query(root, "//node[@xlink:href]")
	for _, attr := range node.Attr {
		fmt.Println("NamespaceURI:", attr.NamespaceURI)
		fmt.Println("Prefix:", attr.Name.Space)
		fmt.Println("Data:", attr.Name.Local)
	}

zhengchun • Jun 30 '23 14:06

That could work, but then I must first parse the XPath "//node[@xlink:href]" to extract the prefix (xlink) and the local name (href), and then loop over all the attributes to find the ones that match. That is not very efficient.

Anyway, thanks for your help @zhengchun: much appreciated.

fgateuil • Jul 02 '23 12:07