subfinder
subfinder copied to clipboard
[Issue] Logging using incorrect level when verbose is set up
Describe the bug Hello there!
I'm encountering a difficulty related to logging in subfinder, which seems to be connected to how gologger handles label and log level. In the piece of code provided below, it seems that whenever an event that includes a label value is logged, this value gets converted to log level during formatting. Is there a specific reasoning behind this design decision?
To illustrate, when I operate subfinder with verbose log level configuration gologger.DefaultLogger.SetMaxLevel(levels.LevelVerbose)
, I encounter numerous logs with a log level that my external logging system cannot process.
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-77-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"code.l.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"captive.adservice.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-64-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"google-proxy-66-249-83-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-73-0.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"rate-limited-proxy-108-177-73-100.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"colab-sandbox.research.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"tensorboard-staging.cloud.google.com","timestamp":"2023-05-25T14:07:57+0000"}
{"level":"dnsdumpster","msg":"mail.google.com","timestamp":"2023-05-25T14:07:57+0000"}
Subfinder version 2.5.8
Complete command you used to reproduce this I'm not running the CLI but rather using the lib, but the code below should reproduce this bug
package main
import (
"bytes"
"fmt"
"io"
"log"
"github.com/projectdiscovery/subfinder/v2/pkg/resolve"
"github.com/projectdiscovery/subfinder/v2/pkg/runner"
"github.com/projectdiscovery/gologger"
)
func main() {
gologger.DefaultLogger.SetMaxLevel(levels.LevelVerbose)
gologger.DefaultLogger.SetFormatter(&formatter.JSON{})
runnerInstance, err := runner.NewRunner(&runner.Options{
Threads: 10, // Thread controls the number of threads to use for active enumerations
Timeout: 30, // Timeout is the seconds to wait for sources to respond
MaxEnumerationTime: 10, // MaxEnumerationTime is the maximum amount of time in mins to wait for enumeration
Resolvers: resolve.DefaultResolvers, // Use the default list of resolvers by marshaling it to the config
ResultCallback: func(s *resolve.HostEntry) { // Callback function to execute for available host
log.Println(s.Host, s.Source)
},
})
buf := bytes.Buffer{}
err = runnerInstance.EnumerateSingleDomain("projectdiscovery.io", []io.Writer{&buf})
if err != nil {
log.Fatal(err)
}
data, err := io.ReadAll(&buf)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%s", data)
}
You will then see multiple logs such as :
{"level":"hackertarget","msg":"nuclei.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"hackertarget","msg":"chaos.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"hackertarget","msg":"test.auth.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"hackertarget","msg":"dev.auth.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
{"level":"alienvault","msg":"test.auth.projectdiscovery.io","timestamp":"2023-05-26T10:17:26+0000"}
The problem comes from those lines of code : https://github.com/projectdiscovery/subfinder/blob/5aca90d176393ae109f0fc0036b1cbb1b23b0012/v2/pkg/runner/enumerate.go#L69-L72
It seems like the Label() method overwrite the level of the log event object. However, here we just want to specify which source found the domain name (while still keeping the event log level at verbose
).
I think this could be easily fixed by using something like gologger.Verbose().Str("source", result.Source).Msg(subdomain)
@reivaxxavier , if i understood this correctly the issues is that gologger logging design is different that expected and is not compatible with your processing system
I encounter numerous logs with a log level that my external logging system cannot process.
the usage of format
gologger.Verbose().Label(result.Source).Msg(subdomain)
is common across all pd projects and i think this issue should be addressed as a feature request in gologger repo to make it more standard or option to change this specific behaviour
i see you have already created a issue for that at https://github.com/projectdiscovery/gologger/issues/38 . since this is not related to subfinder . i will be closing this issue and we can track the gologger feature request at https://github.com/projectdiscovery/gologger/issues/38
I think that this should be reopened.
the usage of format [...] is common across all pd projects
From what I see, this seems specific to subfinder.
- No label usage in httpx: https://github.com/search?q=repo%3Aprojectdiscovery%2Fhttpx+label%28&type=code
- In Nuclei, (except for a debug message), the label value is properly defined (respecting one of the allowed values, like
WRN
, defined by gologger: https://github.com/projectdiscovery/gologger/blob/1fc5289842cc77b4c81dfef7b775cf3b25f97cdf/gologger.go#L15-L22): https://github.com/search?q=repo%3Aprojectdiscovery%2Fnuclei%20label(&type=code
Reopening to revisit after https://github.com/projectdiscovery/gologger/issues/38 worked on.
Thanks for reopening this issue @ehsandeep!
I don't think that https://github.com/projectdiscovery/gologger/issues/38 needs to be fixed. Instead, an appropriate Label value should be given in the argument of the Label(...) function, when called by subfinder.
This is what https://github.com/projectdiscovery/subfinder/pull/848 intends to do.
I'm on the same page with gnuletik on this one. The trouble with gologger doesn't really have anything to do with the way subfinder uses the Label function.
The thing is, Label function should be given a 'level', not a 'source name'. So, it makes sense to sort this out in subfinder itself, rather than waiting for a fix from gologger. Because, even if gologger comes up with a fix, it's not going to change the fact that the Label usage in subfinder still needs to be fixed.
@xavier-gerondeau-stoik , @gnuletik the intention behind creating gologger issue is that we can add a option in gologger to strictly behave like a logger and do not accept if any unknown labels are given and if given wrap it (like golang error)
ex when that option is implemented
gologger.DefaultLogger.Label("securitytrails").Msg("api.hackerone.com")
this will return output in below format
[VER] securitytrails: api.hackerone.com
it might be just me but when i am using subfinder binary in verbose mode i like
[SecurityTrails] api.hackerone.com
^ more than
[VER] Found subdomain subdomain_source=virustotal subdomain=mta-sts.forwarding.hackerone.com
since its simple . and i can easily do some bash foo with that format
Thanks for the feedback @tarunKoyalwar!
Even if the following can be easier to read when running a CLI, the following example is not following any logging convention.
[SecurityTrails] api.hackerone.com
The key=value
pair is also quite easy to parse and defined as the Common Event Format.
[VER] Found subdomain subdomain_source=securitytrail subdomain=api.hackerone.com
This approach even has it full potential when you want to add other fields (we can imagine fields like confidence
, ip
etc..).
Also, having a static log message ("Found subdomain") helps when searching among multiple log messages (e.g we can search all "Found subdomain" messages and extract their subdomain
values).
But I can understand if you want to avoid using key=value
pair to improve readability in CLI mode.
Until gologger
adds a Label feature, what do you think about implementing the expected output with the following code?
gologger.DefaultLogger.Msg("securitytrails: api.hackerone.com")