add strict probe option for httpx

Open dogancanbakir opened this issue 5 months ago • 11 comments

Proposed changes

closes https://github.com/projectdiscovery/nuclei/issues/6651

Checklist

  • [x] Pull request is created against the dev branch
  • [ ] All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • [ ] I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added necessary documentation (if appropriate)

Summary by CodeRabbit

New Features

  • Added --strict-probe (-sp) CLI flag to enforce strict HTTP probe mode. When enabled, scanning stops if the HTTP probe returns zero URLs, preventing fallback to raw input.

dogancanbakir avatar Dec 03 '25 04:12 dogancanbakir

Walkthrough

A new strict probe feature is added to Nuclei via a --strict-probe CLI flag. When enabled, the scanner stops immediately if the internal httpx probe returns zero URLs, skipping fallback to raw input scanning. The feature involves adding an Options field, modifying the HTTP input initialization to return URL counts, and implementing early-exit logic in the enumeration runner.

Changes

Cohort / File(s) | Summary
Strict Probe Feature – Configuration & CLI (cmd/nuclei/main.go, pkg/types/types.go) | Added StrictProbe boolean field to Options struct. Wired new --strict-probe / -sp CLI flag in optimization flag group, defaulting to false.
Strict Probe Feature – HTTP Input Processing (internal/runner/inputs.go, internal/runner/runner.go) | Modified initializeTemplatesHTTPInput() method signature to return an additional int32 URL count. Updated RunEnumeration() to check StrictProbe flag and urlCount; if StrictProbe is set and urlCount is zero, logs and returns early without scanning.
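
To make the early-exit condition in the walkthrough concrete, here is a minimal sketch. It is not the code from this PR: `shouldStopAfterProbe` is a hypothetical helper, and the `Options` struct below carries only the one field discussed here, whereas the real struct in pkg/types has many more.

```go
package main

import "fmt"

// Options is a trimmed-down stand-in for nuclei's Options struct.
// Only the field added by this PR is modeled (assumption for illustration).
type Options struct {
	StrictProbe bool
}

// shouldStopAfterProbe is a hypothetical helper, not a nuclei function.
// It returns true when strict probing is enabled and the httpx probe
// produced no URLs, which is the case where the patch returns early.
func shouldStopAfterProbe(opts Options, urlCount int32) bool {
	return opts.StrictProbe && urlCount == 0
}

func main() {
	// Strict mode, probe found nothing: stop.
	fmt.Println(shouldStopAfterProbe(Options{StrictProbe: true}, 0))
	// Default mode, probe found nothing: keep going (fallback to raw input).
	fmt.Println(shouldStopAfterProbe(Options{StrictProbe: false}, 0))
	// Strict mode, probe found URLs: keep going.
	fmt.Println(shouldStopAfterProbe(Options{StrictProbe: true}, 3))
}
```

Note that this check is global to the run: once it returns true, no template of any protocol is executed, which is the behavior the review comments below question.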

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

  • Verify that the Copy() method in Options intentionally omits StrictProbe (field will default to false in clones)
  • Confirm early-exit logic in RunEnumeration() correctly prevents further scanning when StrictProbe is enabled and probe finds zero URLs
  • Check that all return paths in initializeTemplatesHTTPInput() properly include the int32 count value

Poem

🐰 A rabbit's flag, so strict and wise,
No fallback schemes, no false surprise!
When httpx finds naught but empty space,
We hop right past, no time to waste.
Swift and clean, the probe stands tall—
Zero results? We skip it all! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title 'add strict probe option for httpx' directly and clearly describes the main change: adding a new command-line flag/option for strict probing behavior.
Linked Issues check | ✅ Passed | The PR implements the core requirement from issue #6651: a --strict-probe flag that stops scanning when httpx probe returns zero URLs, avoiding fallback to raw input.
Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing the strict probe feature: adding the StrictProbe field to Options, wiring it to CLI flags, and integrating the logic into the HTTP input initialization flow.

coderabbitai[bot] avatar Dec 03 '25 04:12 coderabbitai[bot]

@dwisiswant0 This optional flag allows users to decide whether to keep scanning the specified target, based on the output of the httpx integration as stated. We might need to make it a bit more explicit, maybe -httpx-strict-probe or something similar, but this is not the default behavior.

dogancanbakir avatar Dec 03 '25 07:12 dogancanbakir

@dogancanbakir - what I was trying to say is, even if a target cannot be probed by httpx, it should still remain eligible for scanning through other non-HTTP protocols. So the inability to probe via httpx shouldn't automatically exclude it from the rest of the non-HTTP-protocol-based templates.

dwisiswant0 avatar Dec 03 '25 09:12 dwisiswant0

and

By introducing -strict-probe, we would effectively enforce that all inputs MUST be valid HTTP/S targets by default

By "default", I mean the flag's standard behavior when users enable it.

dwisiswant0 avatar Dec 03 '25 09:12 dwisiswant0

Proposal:

However, Nuclei currently triggers an Auto Fallback mechanism. It ignores the failed probe and proceeds to scan the target using the raw input, assuming it might be a valid HTTP service. In scenarios with large attack surfaces containing many non-HTTP open ports, this causes Nuclei to waste significant time sending thousands of HTTP requests to non-HTTP ports

The way I see it, with -strict-probe, targets that fail httpx probing should be excluded specifically from HTTP-protocol-based templates only, while remaining eligible for all non-HTTP-protocol scans.

But, additionally, that approach would end up skipping any HTTP-protocol-based template that overrides its own host & port settings.

$ grep -Pnr "\{\{Host(name)?\}\}:[0-9]{1,5}" http/
http/vulnerabilities/unifi/unifi-nfc-credentials.yaml:20:        @Host: {{Hostname}}:9780
http/vulnerabilities/unifi/unifi-create-user.yaml:23:        @Host: {{Hostname}}:9780
http/cves/2025/CVE-2025-52665.yaml:55:        @Host: {{Hostname}}:9780
http/cves/2019/CVE-2019-8451.yaml:41:      url=https://{{Host}}:443@{{interactsh-url}}
http/cves/2024/CVE-2024-6396.yaml:54:        @Host: http://{{Host}}:43800
http/cves/2024/CVE-2024-6396.yaml:56:        Host: {{Host}}:43800
http/cves/2018/CVE-2018-8024.yaml:33:      - "{{Host}}:4040/jobs/?\"'><script>alert(document.domain)</script>"
http/cves/2020/CVE-2020-9480.yaml:50:            "spark.master": "spark://{{Hostname}}:6066"
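
For anyone who wants to detect such overrides programmatically, the same pattern as the grep above can be used from Go's standard `regexp` package. This is only an illustrative sketch: `overridesPort` is a hypothetical helper and is not part of nuclei.

```go
package main

import (
	"fmt"
	"regexp"
)

// hostPortOverride uses the same expression as the grep command above:
// a {{Host}} or {{Hostname}} placeholder followed by an explicit port.
var hostPortOverride = regexp.MustCompile(`\{\{Host(name)?\}\}:[0-9]{1,5}`)

// overridesPort is a hypothetical helper that reports whether a template
// line pins its own port instead of using the one from the input target.
func overridesPort(line string) bool {
	return hostPortOverride.MatchString(line)
}

func main() {
	lines := []string{
		"@Host: {{Hostname}}:9780", // explicit port, matches
		"Host: {{Host}}:43800",     // explicit port, matches
		"Host: {{Hostname}}",       // no port override, does not match
	}
	for _, l := range lines {
		fmt.Printf("%q -> %v\n", l, overridesPort(l))
	}
}
```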

dwisiswant0 avatar Dec 03 '25 09:12 dwisiswant0

This is the desired behavior explicitly requested by the community user, and I don't see any issue with this, as long as we don't change nuclei's default behavior.

dogancanbakir avatar Dec 03 '25 12:12 dogancanbakir

Tagging the requester @sixteen250. I want to elaborate a bit more on the concern I'm trying to highlight, because the current behavior of the patch can lead to unintended side effects across the entire scanning flow. From what I understand, the logic effectively works like this:

target input (e.g., host:22)
       ↓
strict-probe applied
       ↓
httpx runs → returns 0 (no successful probe)
       ↓
the target is excluded from scanning (dns, javascript, network, etc.)

The issue here is that strict-probe ends up interpreting an HTTP(S)-level failure as a total host failure. In reality, the only thing that's unreachable is the HTTP protocol surface (typically port 80/443), but other protocols may still be perfectly valid and scannable.

This logic cascades too broadly, causing the scanner to skip ALL protocols simply because httpx cannot probe one or two ports. As a result, templates for DNS, Network, JavaScript, etc., protocols never get a chance to run, even though the host itself might still be very much alive on other ports/services.

tl;dr: This strict-probe behavior becomes overly authoritative: it concludes "the host is dead" rather than "the HTTP surface is dead". This distinction matters because the former shuts down the entire scan scope, while the latter should only limit HTTP-protocol-based templates.

That's the intent behind my concern, defining the boundaries of what strict-probe should and should not influence. Let me know what you think or if clarification is needed.

dwisiswant0 avatar Dec 03 '25 13:12 dwisiswant0

To make the scenario easier to visualize, let's use the example provided by the requester, @sixteen250, which uses port 3306 (MySQL).

If the user supplies a target like host:3306 and strict-probe is enabled, then under the current patch logic, that target will inevitably be dropped entirely. The reason is simple: httpx will try to probe it, fail (because 3306 is not an HTTP service), and return no results, which the strict-probe logic interprets as a signal to exclude the target from ALL templates.

But that behavior is incorrect, because while the target is not running an HTTP service, we DO have a number of templates that are specifically intended to scan MySQL on port 3306. For example:

javascript/misconfiguration/mysql/mysql-empty-password.yaml
javascript/default-logins/mysql-default-login.yaml
javascript/enumeration/mysql/mysql-info.yaml
javascript/enumeration/mysql/mysql-show-variables.yaml
javascript/enumeration/mysql/mysql-user-enum.yaml
javascript/enumeration/mysql/mysql-show-databases.yaml

Those templates are explicitly designed to interact with MySQL, meaning they remain completely valid even if httpx cannot probe the host. So dropping the target solely because httpx fails ends up preventing a whole set of relevant, non-HTTP-protocol-based templates from running.

That's the core of the issue: strict-probe should restrict HTTP-protocol-based templates when httpx fails, and it should NOT disqualify the target from all other protocol templates that operate independently of HTTP.
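
A rough sketch of that boundary, under stated assumptions: the `Template` type and `selectTemplates` function below are hypothetical and do not reflect nuclei's internal types; they only illustrate that a failed httpx probe would filter out HTTP-protocol templates and leave the rest untouched.

```go
package main

import "fmt"

// Template is a simplified, hypothetical stand-in for a nuclei template.
type Template struct {
	Path     string
	Protocol string // e.g. "http", "javascript", "network", "dns"
}

// selectTemplates returns the templates that remain eligible for a target.
// With strict probing enabled and a failed HTTP probe, only templates whose
// protocol is "http" are dropped; all other protocols are kept.
func selectTemplates(all []Template, strictProbe, httpProbeOK bool) []Template {
	if !strictProbe || httpProbeOK {
		return all
	}
	kept := make([]Template, 0, len(all))
	for _, t := range all {
		if t.Protocol != "http" {
			kept = append(kept, t)
		}
	}
	return kept
}

func main() {
	templates := []Template{
		{Path: "http/cves/2019/CVE-2019-8451.yaml", Protocol: "http"},
		{Path: "javascript/enumeration/mysql/mysql-info.yaml", Protocol: "javascript"},
	}
	// Target such as host:3306: strict probe on, httpx found no HTTP service.
	for _, t := range selectTemplates(templates, true, false) {
		fmt.Println(t.Path)
	}
}
```

With the host:3306 example, only the MySQL template path is printed; the HTTP template is skipped rather than the whole target.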

dwisiswant0 avatar Dec 03 '25 14:12 dwisiswant0

Yeah, I agree. Disabling all templates if HTTP is not detected would be overly strict; instead, we should still check other port templates / protocols etc. cc @tarunKoyalwar

Ice3man543 avatar Dec 03 '25 15:12 Ice3man543

Thanks for looping me in @dwisiswant0.

Regarding the naming proposed by @dogancanbakir: I agree that renaming it to -httpx-strict-probe (or similar) makes it much more explicit that this flag controls the behavior of the httpx integration specifically.

Regarding the logic concern: I strongly agree with @dwisiswant0's assessment. My original issue was indeed about the inefficiency and timeouts caused by sending HTTP requests to non-HTTP ports. However, completely dropping the target if the probe fails would lead to missed vulnerabilities on non-Web services (like the MySQL example mentioned).

The ideal behavior for this flag would be: If httpx probe fails → Skip ONLY http protocol templates → Continue scanning with network/javascript/dns templates.

This would solve the timeout performance issue without sacrificing coverage for other services.

sixteen250 avatar Dec 04 '25 06:12 sixteen250

Thanks for the clarification @sixteen250. I got the impression that the desired behavior was to completely skip the target from the following statement:

I would like a new flag (e.g., -strict-probe or -no-fallback) that changes this behavior. If the internal httpx probe returns 0 URLs (meaning the target is confirmed not to be a Web Service), Nuclei should immediately stop processing that host and not fallback to raw input scanning.

Anyway, I'll make the necessary changes to the PR.

Thanks @dwisiswant0 and all for the input!

dogancanbakir avatar Dec 04 '25 06:12 dogancanbakir