improve fqdn extraction from response body using parsers

Open tarunKoyalwar opened this issue 1 year ago • 0 comments

Please describe your feature request:

  • currently we use regex to extract potential domains and then apply some heuristic rules
  • this can be further improved by using actual parsers
    • html -> goquery
    • javascript -> goja ast parser (https://github.com/dop251/goja/tree/master/parser)
  • parsers allow us to locate specific contexts and extract fqdns only from those places (ex: src and href attributes in html, string literals in javascript, etc.)

Describe the use case of this feature:

  • fewer false positives (FP) and false negatives (FN) in the extracted domains

tarunKoyalwar, Jun 22 '24 19:06