disable dedupe in response file write when `-sd` is used
httpx can avoid overwriting the response file for a repeated input when `-skip-dedupe` is used, since the user has explicitly disabled deduplication and wants to keep each copy of the response.
Input:
$ cat test.txt
example.com
example.com
httpx run:
$ httpx -l test.txt -stream -skip-dedupe -sr -silent
https://example.com
https://example.com
Current behavior:
$ tree output/response
output/response
├── example.com
│ └── cea8d4cbc5e3b39fcbcf053e0e0244fe14c835ae.txt
└── index.txt
1 directory, 2 files
Expected behavior:
$ tree output/response
output/response
├── example.com
│ └── cea8d4cbc5e3b39fcbcf053e0e0244fe14c835ae.txt
+├── example.com
+│ └── cea8d4cbc5e3b39fcbcf053e0e0244fe14c835ae.txt
└── index.txt
1 directory, 3 files
Question: how can you have two example.com directories (same name)? Don't we need something to differentiate?
Hi team 👋
I'm looking into contributing a fix for a case where response files get overwritten when scanning the same domain multiple times.
The response path is determined using:
domainFile := resp.Method + ":" + URL.EscapedString()
hash := hashes.Sha1([]byte(domainFile))
domainResponseFile := fmt.Sprintf("%s.txt", hash)
responseBaseDir := filepath.Join(..., hostFilename)
responsePath := filepath.Join(responseBaseDir, domainResponseFile)
This results in the same response file (
Would it make sense to append an incrementing suffix like:
localhost_8000/59bd76...fe3b_1.txt
localhost_8000/59bd76...fe3b_2.txt
to avoid overwriting and allow storing multiple responses for the same domain and path?
Or is there a preferred way to handle this use case?
Thanks!
> incrementing suffix like
This sounds like a good idea, @jjhwan-h.
Hi, I have a quick question while reviewing this issue.
I noticed that in both runner.go:1097 and runner.go:2154, the same responsePath appears to be written with the same data.
(Specifically, RunEnumeration calls process, which in turn calls analyze, where the write operations occur.)
Is there a particular reason for writing the same response to disk twice, or might this be an unintentional redundancy? @ehsandeep
This issue has been automatically closed due to inactivity. If you think this is a mistake or would like to continue the discussion, please comment or feel free to reopen it.