htmlquery

Xpath position function not working properly

Open carlows opened this issue 3 years ago • 3 comments

Hi there,

First of all, thank you for the packages, they're very useful 🚀

I've been having issues with the position() function, and I'm not sure whether the problem is in the htmlquery package or the xpath package. Here's an example:

const htmlSample = `<!DOCTYPE html><html lang="en-US">
<head>
<title>Hello,World!</title>
</head>
<body>
<div class="test">
	<a href="/test1">Test 1</a>
</div>
<div class="test">
	<a href="/test2">Test 1</a>
</div>
<div class="test">
	<a href="/test3">Test 1</a>
</div>
</body>
</html>
`

func TestXPath(t *testing.T) {
	list := Find(testDoc, "//div[@class=\"test\" and position()=1]//a/@href")
	for _, n := range list {
		fmt.Println(InnerText(n))
	}
}

I would expect this to filter all the nodes that have the class test and have a position == 1, so only the first <a /> element. But instead, I get all the nodes. If I try position()=2 I get nothing back.

If I instead use this xpath, it gives me the correct element:

//div[@class=\"test\"][2]//a/@href

If I try the position() expression in the browser it works, so I'm not sure if it is expected to behave differently here 🤔.

What could be the problem? Thank you again!

carlows, Mar 02 '21 00:03

Thanks for the report. It is a bug in how position() is handled inside a logical operation.

zhengchun, Mar 02 '21 15:03

Maybe I can help debug it and open a PR. Do you have an idea of where to look in the code?

carlows, Mar 02 '21 20:03

Sorry, it was not a position() bug after all. I found that the query works if the expression is changed so that position() comes first:

list := htmlquery.Find(doc, "//div[position()=1 and @class=\"test\"]/a/@href")

I guess build.processNode() has a bug.

If you are interested, you can start debugging at func (l *logicalQuery) Evaluate(t iterator) interface{} {...}.

zhengchun, Mar 03 '21 12:03