htmlquery

Use an interface for LRU cache

Open JWAlberty opened this issue 1 year ago • 1 comments

We use xpath directly to validate our XPath queries, and htmlquery to do the actual XPath selection. Our direct xpath usage goes through an LRU cache, which means we end up with two caches: our own for validation and htmlquery's internal one.

This PR exposes that cache as an interface, which would let us share a single cache and also supply alternative cache implementations. Performance and behavior remain the same as in the current implementation, but the cache becomes testable: in tests we can plug in a null (no-op) cache or a pre-primed cache.

This small change makes htmlquery a lot more testable and transparent.
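To make the idea concrete, here is a minimal sketch of what such a cache interface could look like. The names `SelectorCache`, `NullCache`, and `MapCache` are hypothetical and are not taken from htmlquery or from the PR; values are typed as `any` so the sketch runs without any XPath library.

```go
package main

import "fmt"

// SelectorCache is a hypothetical interface; htmlquery does not define it.
// Values are `any` so the sketch stays independent of any XPath library.
type SelectorCache interface {
	Get(key string) (value any, ok bool)
	Add(key string, value any)
}

// NullCache never stores anything, so every lookup is a miss.
type NullCache struct{}

func (NullCache) Get(string) (any, bool) { return nil, false }
func (NullCache) Add(string, any)        {}

// MapCache is an unbounded cache that a test can pre-prime with entries.
type MapCache map[string]any

func (m MapCache) Get(key string) (any, bool) {
	v, ok := m[key]
	return v, ok
}

func (m MapCache) Add(key string, value any) {
	m[key] = value
}

func main() {
	// Both implementations satisfy the same interface and are interchangeable.
	caches := []SelectorCache{NullCache{}, MapCache{"//a": "pre-primed"}}
	for _, c := range caches {
		c.Add("//div", "compiled")
		_, hitA := c.Get("//a")
		_, hitDiv := c.Get("//div")
		fmt.Printf("%T: //a hit=%v, //div hit=%v\n", c, hitA, hitDiv)
	}
}
```

With an interface along these lines, a test could pass `NullCache{}` to force every selector to be recompiled, or a pre-filled `MapCache` to check cache-hit behavior.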

JWAlberty, Dec 29 '23 20:12

Thanks for your PR.

I have another solution that doesn't require any code changes.

First, disable htmlquery's own caching by setting htmlquery.DisableSelectorCache = true.

Next, use Expr.Select() (https://pkg.go.dev/github.com/antchfx/xpath#Expr.Select) instead of htmlquery.Query(). This way you can keep using your own LRU cache and reuse each cached Expr on the next call.

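// cache is your own LRU cache; doc is the *html.Node returned by htmlquery.Parse.
exp, ok := cache.Get(key)
if !ok {
  var err error
  exp, err = xpath.Compile(selector)
  if err != nil {
    return nil, err
  }
  cache.Add(key, exp)
}
// Expr.Select takes an xpath.NodeNavigator, so wrap the document first.
iter := exp.Select(htmlquery.CreateXPathNavigator(doc))
var list []*html.Node
for iter.MoveNext() {
  list = append(list, iter.Current().(*htmlquery.NodeNavigator).Current())
}
return list, nil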
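The get-or-compile step above can be sketched on its own with only the standard library. This is an illustration under stated assumptions, not the cache either project ships: `LRU` and `GetOrCompile` are hypothetical names, and the `compile` callback stands in for `xpath.Compile` (in real use `V` would be `*xpath.Expr`).

```go
package main

import (
	"container/list"
	"fmt"
	"strings"
)

// entry is one key/value pair stored in the LRU list.
type entry[V any] struct {
	key   string
	value V
}

// LRU is a minimal, non-thread-safe least-recently-used cache (hypothetical).
type LRU[V any] struct {
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element in order
}

func NewLRU[V any](max int) *LRU[V] {
	return &LRU[V]{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached value and marks it as recently used.
func (c *LRU[V]) Get(key string) (V, bool) {
	if el, ok := c.items[key]; ok {
		c.order.MoveToFront(el)
		return el.Value.(entry[V]).value, true
	}
	var zero V
	return zero, false
}

// Add inserts or updates a value and evicts the oldest entry when full.
func (c *LRU[V]) Add(key string, value V) {
	if el, ok := c.items[key]; ok {
		el.Value = entry[V]{key, value}
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(entry[V]{key, value})
	if c.order.Len() > c.max {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry[V]).key)
	}
}

// GetOrCompile returns the cached value for selector, compiling on a miss.
// Failed compilations are not cached.
func GetOrCompile[V any](c *LRU[V], selector string, compile func(string) (V, error)) (V, error) {
	if v, ok := c.Get(selector); ok {
		return v, nil
	}
	v, err := compile(selector)
	if err != nil {
		return v, err
	}
	c.Add(selector, v)
	return v, nil
}

func main() {
	calls := 0
	// Stand-in for xpath.Compile so the sketch needs no third-party import.
	compile := func(s string) (string, error) {
		calls++
		return strings.ToUpper(s), nil
	}
	cache := NewLRU[string](2)
	for _, sel := range []string{"//a", "//a", "//p"} {
		GetOrCompile(cache, sel, compile)
	}
	fmt.Println("compile calls:", calls)
}
```

Because the same cache instance can be handed to both the validation path and the selection path, each selector is compiled at most once while it stays in the cache.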

What do you think?

zhengchun, Dec 30 '23 13:12