Potential bug with getting parent node ( /.. )
Steps to reproduce: save the content of view-source:https://www.productfrom.com/product/416492-adidas-copa-mundial-soccer-shoes in test.html, then run this program:
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadDoc("test.html")
	if err != nil {
		panic(err)
	}
	product := htmlquery.FindOne(doc, "//div[contains(@class, 'grid grid-cols-12 gap-0 border-t py-2 px-4')]//div[contains(.,'Product Name')]/..//div[contains(@class, 'text:right')]//span")
	if product != nil {
		fmt.Println("product:", htmlquery.InnerText(product))
	}
}
The program prints nothing. However, when I test the XPath expression online, e.g. at https://htmlstrip.com/xpath-tester, it finds a match. The expression I used there:
//div[contains(@class, 'grid grid-cols-12 gap-0 border-t py-2 px-4')]//div[contains(.,'Product Name')]/..//div[contains(@class, ' text:right')]//span
Your Go program will work if you change "text:right" in your XPath expression to "text-right":
$ curl -o test.html https://www.productfrom.com/product/416492-adidas-copa-mundial-soccer-shoes
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 37512 0 37512 0 0 68327 0 --:--:-- --:--:-- --:--:-- 68327
$ cat main.go
package main

import (
	"fmt"

	"github.com/antchfx/htmlquery"
)

func main() {
	doc, err := htmlquery.LoadDoc("test.html")
	if err != nil {
		panic(err)
	}
	product := htmlquery.FindOne(doc, "//div[contains(@class, 'grid grid-cols-12 gap-0 border-t py-2 px-4')]//div[contains(.,'Product Name')]/..//div[contains(@class, 'text:right')]//span")
	if product != nil {
		fmt.Println("product:", htmlquery.InnerText(product))
	}
}
$ diff main.go main-typo-fixed.go
14c14
< product := htmlquery.FindOne(doc, "//div[contains(@class, 'grid grid-cols-12 gap-0 border-t py-2 px-4')]//div[contains(.,'Product Name')]/..//div[contains(@class, 'text:right')]//span")
---
> product := htmlquery.FindOne(doc, "//div[contains(@class, 'grid grid-cols-12 gap-0 border-t py-2 px-4')]//div[contains(.,'Product Name')]/..//div[contains(@class, 'text-right')]//span")
$ go run main.go
$ go run main-typo-fixed.go
product: Adidas COPA MUNDIAL soccer shoes
I am not sure why your original XPath expression, with the colon in the class name, works on https://htmlstrip.com/xpath-tester; it does not work in the Chrome DevTools console. There, too, changing "text:right" to "text-right" returns the correct element.
Hello, I checked the URL you gave (view-source:https://www.productfrom.com/product/416492-adidas-copa-mundial-soccer-shoes). The HTML source does not contain "text:right" anywhere; it only contains "text-right".
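A quick way to check this yourself, independent of any XPath engine, is to count how often each spelling occurs in the saved page. The sketch below is only an illustration: countToken and sampleHTML are hypothetical names, and the sample fragment is made up to stand in for the real page when test.html is not present in the working directory.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// sampleHTML is a made-up fragment (an assumption, not the real page);
// it is used only when test.html cannot be read.
const sampleHTML = `<div class="grid grid-cols-12 gap-0 border-t py-2 px-4">
  <div class="col-span-4">Product Name</div>
  <div class="col-span-8 text-right"><span>Adidas COPA MUNDIAL soccer shoes</span></div>
</div>`

// countToken reports how many times token occurs verbatim in src.
// This is plain substring matching, similar in spirit to XPath contains().
func countToken(src, token string) int {
	return strings.Count(src, token)
}

func main() {
	src := sampleHTML
	// Prefer the saved page from the reproduction steps if it exists.
	if data, err := os.ReadFile("test.html"); err == nil {
		src = string(data)
	}
	for _, token := range []string{"text:right", "text-right"} {
		fmt.Printf("%q occurs %d time(s)\n", token, countToken(src, token))
	}
}
```

If "text:right" is reported zero times, a contains(@class, 'text:right') predicate has nothing to match, which is consistent with FindOne returning nil in the original program.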