
How to match an empty response body?

Open • dogancanbakir opened this issue 1 year ago • 1 comment

Discussed in https://github.com/projectdiscovery/httpx/discussions/1628

Originally posted by Yardanico March 13, 2024

Hi, I've been using httpx for quite a long time, and it's a really great project! Right now I'm struggling a bit to match empty response bodies. As far as I can tell, there's no DSL variable that gives the actual length of the response body.

As far as I understand, content_length is simply taken from the Content-Length response header, and it's undefined when that header is absent (I know that websites shouldn't respond like this, but sadly almost no one follows standards nowadays):

httpx-toolkit -u "redacted" -path "/wp-config.php" -filter-condition 'content_length > 0' -debug

    __    __  __       _  __
   / /_  / /_/ /_____ | |/ /
  / __ \/ __/ __/ __ \|   /
 / / / / /_/ /_/ /_/ /   |
/_/ /_/\__/\__/ .___/_/|_|
             /_/

                projectdiscovery.io

[INF] Current httpx version v1.6.0 (latest)
[INF] Dumped HTTP request for https://redacted/wp-config.php

GET /wp-config.php HTTP/1.1
Host: redacted
User-Agent: Mozilla/5.0 (X11; CrOS x86_64 15232.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
Accept-Charset: utf-8
Accept-Encoding: gzip

[INF] Dumped HTTP response for https://redacted/wp-config.php

HTTP/1.1 200 OK
Connection: close
Accept-Ranges: bytes
Age: 1874
Cache-Control: public
Content-Type: text/html; charset=UTF-8
Date: Wed, 13 Mar 2024 12:27:53 GMT
Referrer-Policy: strict-origin-when-cross-origin
Server: nginx
Strict-Transport-Security: max-age=15768000
Vary: Accept-Encoding
X-Cache: HIT
X-Cacheable: YES
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block

[ERR] Could not evaluate DSL expression: No parameter 'content_length' found.
https://redacted/wp-config.php

So maybe I missed something obvious, but how do I filter such websites? I also tried checking the hash: body_md5 isn't available in the DSL either, yet funnily enough httpx still prints the hash (of an empty, 0-byte string) in its output when asked:

httpx-toolkit -u "redacted" -hash md5 -path "/wp-config.php" -filter-condition 'contains(body_md5, "d41d8cd98f00b204e9800998ecf8427e")'

    __    __  __       _  __
   / /_  / /_/ /_____ | |/ /
  / __ \/ __/ __/ __ \|   /
 / / / / /_/ /_/ /_/ /   |
/_/ /_/\__/\__/ .___/_/|_|
             /_/

                projectdiscovery.io

[INF] Current httpx version v1.6.0 (latest)
[ERR] Could not evaluate DSL expression: No parameter 'body_md5' found.
https://redacted/wp-config.php [d41d8cd98f00b204e9800998ecf8427e]
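
Since the hash does show up in the output, one possible workaround is to filter httpx's output after the fact, outside the DSL. The following is only a minimal Python sketch, assuming plain, uncolored result lines of the `URL [hash]` shape shown above. `drop_empty_bodies` is a hypothetical helper (not part of httpx), and the second sample line is made up for illustration.

```python
import hashlib
import re

# MD5 of zero bytes; this is the hash httpx printed in the output above.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()

# Assumed line shape: "<url> [<32 hex chars>]", as in the output above.
LINE_RE = re.compile(r"^(?P<url>\S+) \[(?P<md5>[0-9a-fA-F]{32})\]$")


def drop_empty_bodies(lines):
    """Return URLs whose reported MD5 is not the empty-body hash.

    Lines that do not match the assumed shape are skipped.
    """
    kept = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("md5").lower() != EMPTY_MD5:
            kept.append(match.group("url"))
    return kept


if __name__ == "__main__":
    sample = [
        "https://redacted/wp-config.php [d41d8cd98f00b204e9800998ecf8427e]",
        # Hypothetical non-empty result (MD5 of the string "hello").
        "https://redacted/index.php [5d41402abc4b2a76b9719d911017c592]",
    ]
    print(drop_empty_bodies(sample))
```

This sidesteps the missing body_md5 variable rather than fixing it, so it only helps when post-processing the output is acceptable.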

Both word and line count are 1 for a completely empty body, but they can also be 1 for legitimate short responses, so I can't use them to filter out completely empty responses.
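
A count of 1 for an empty body is what you would see if the counts come from splitting the body on a separator, because splitting an empty string still yields one (empty) piece. Whether httpx actually counts this way is an assumption here, not something confirmed in this thread. `count_by_split` is a hypothetical helper that only illustrates the effect.

```python
def count_by_split(body: str, sep: str) -> int:
    """Count pieces the way a naive split-based counter would."""
    # With an explicit separator, splitting "" returns [""], so the count is 1.
    return len(body.split(sep))


if __name__ == "__main__":
    for body in ("", "ok", "two words", "line1\nline2"):
        words = count_by_split(body, " ")
        lines = count_by_split(body, "\n")
        print(repr(body), "words:", words, "lines:", lines)
```

Under that assumption, an empty body and a one-word body such as "ok" both report 1 word and 1 line, which is why these counts cannot tell them apart.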

dogancanbakir • Mar 13 '24 13:03

I made a small debug Python server:

from http.server import BaseHTTPRequestHandler, HTTPServer

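class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Each route below returns 200 with an empty body; only the headers differ.
        if self.path == '/nothing':
            # No Content-Length header at all
            self.send_response(200)
            self.end_headers()
        elif self.path == '/nothing_cl0':
            # Content-Length: 0, which matches the empty body
            self.send_response(200)
            self.send_header('Content-Length', '0')
            self.end_headers()
        elif self.path == '/nothing_cl1':
            # Content-Length: 1, but no body byte is ever written
            self.send_response(200)
            self.send_header('Content-Length', '1')
            self.end_headers()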

def run(server_class=HTTPServer, handler_class=SimpleHTTPRequestHandler, port=8000):
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    print(f'Starting httpd on port {port}...')
    httpd.serve_forever()

if __name__ == '__main__':
    run()

Then check it with the following (expect some errors, because httpx tries HTTPS first):

$ httpx-toolkit -u "localhost:8000" -path "/nothing_cl0" -filter-condition 'content_length < 1'

With /nothing and /nothing_cl0, the content_length variable isn't defined in the DSL. It's only defined for /nothing_cl1, where it equals 1 even though the actual body is still empty. I guess content_length being undefined when Content-Length: 0 is present is a bug, but my question still stands: how do I check the real response body length?
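
To make the gap between the declared and the real body length concrete, here is a minimal sketch that measures both outside httpx, using only Python's standard library. It assumes a handler with the same three routes as the debug server above. `EmptyBodyHandler`, `measure`, and `demo` are names introduced for illustration, and this is not how httpx itself computes anything.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EmptyBodyHandler(BaseHTTPRequestHandler):
    """Answers 200 with no body; two routes add a Content-Length header."""

    def do_GET(self):
        self.send_response(200)
        if self.path == "/nothing_cl0":
            self.send_header("Content-Length", "0")
        elif self.path == "/nothing_cl1":
            self.send_header("Content-Length", "1")
        self.end_headers()

    def log_message(self, *args):
        pass  # suppress per-request logging


def measure(host, port, path):
    """Return (Content-Length header or None, number of bytes received)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        declared = resp.getheader("Content-Length")
        try:
            body = resp.read()
        except http.client.IncompleteRead as exc:
            body = exc.partial  # server declared more bytes than it sent
        return declared, len(body)
    finally:
        conn.close()


def demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), EmptyBodyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        paths = ("/nothing", "/nothing_cl0", "/nothing_cl1")
        return {p: measure("127.0.0.1", port, p) for p in paths}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for path, (declared, actual) in demo().items():
        print(path, "declared:", declared, "actual:", actual)
```

In this sketch the measured length is 0 for all three routes, while the declared value is missing, "0", and "1" respectively. That difference between the header and the bytes on the wire is exactly what the DSL currently gives no way to express.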

Originally posted by @Yardanico in https://github.com/projectdiscovery/httpx/discussions/1628#discussioncomment-8773146

dogancanbakir • Mar 13 '24 14:03