How to match an empty response body?
Discussed in https://github.com/projectdiscovery/httpx/discussions/1628
Originally posted by Yardanico March 13, 2024
Hi, I've been using httpx for quite a long time, and it's a really great project! Right now I'm struggling to match empty response bodies. As far as I can tell, there's no DSL variable that gives the actual length of the response body.
As far as I understand, content_length just takes Content-Length from response headers, and it's not defined if there's no such header present (I know that websites shouldn't respond like this, but sadly almost no one follows standards nowadays):
httpx-toolkit -u "redacted" -path "/wp-config.php" -filter-condition 'content_length > 0' -debug
    __    __  __       _  __
   / /_  / /_/ /_____ | |/ /
  / __ \/ __/ __/ __ \|   /
 / / / / /_/ /_/ /_/ /   |
/_/ /_/\__/\__/ .___/_/|_|
             /_/
                projectdiscovery.io
[INF] Current httpx version v1.6.0 (latest)
[INF] Dumped HTTP request for https://redacted/wp-config.php
GET /wp-config.php HTTP/1.1
Host: redacted
User-Agent: Mozilla/5.0 (X11; CrOS x86_64 15232.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
Accept-Charset: utf-8
Accept-Encoding: gzip
[INF] Dumped HTTP response for https://redacted/wp-config.php
HTTP/1.1 200 OK
Connection: close
Accept-Ranges: bytes
Age: 1874
Cache-Control: public
Content-Type: text/html; charset=UTF-8
Date: Wed, 13 Mar 2024 12:27:53 GMT
Referrer-Policy: strict-origin-when-cross-origin
Server: nginx
Strict-Transport-Security: max-age=15768000
Vary: Accept-Encoding
X-Cache: HIT
X-Cacheable: YES
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Xss-Protection: 1; mode=block
[ERR] Could not evaluate DSL expression: No parameter 'content_length' found.
https://redacted/wp-config.php
So maybe I missed something obvious, but how do I filter such websites? I also tried checking the hash: body_md5 is not available in the DSL either, but funnily enough httpx still prints the hash (of an empty, 0-byte string) in its output when asked:
httpx-toolkit -u "redacted" -hash md5 -path "/wp-config.php" -filter-condition 'contains(body_md5, "d41d8cd98f00b204e9800998ecf8427e")'
    __    __  __       _  __
   / /_  / /_/ /_____ | |/ /
  / __ \/ __/ __/ __ \|   /
 / / / / /_/ /_/ /_/ /   |
/_/ /_/\__/\__/ .___/_/|_|
             /_/
                projectdiscovery.io
[INF] Current httpx version v1.6.0 (latest)
[ERR] Could not evaluate DSL expression: No parameter 'body_md5' found.
https://redacted/wp-config.php [d41d8cd98f00b204e9800998ecf8427e]
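The hash in the output above is the MD5 digest of zero bytes, which can be confirmed independently of httpx. A minimal Python sketch follows; the helper name `looks_empty` is made up for illustration and is not part of httpx:

```python
import hashlib

# MD5 of an empty byte string; this is the value httpx printed above.
EMPTY_MD5 = hashlib.md5(b"").hexdigest()


def looks_empty(md5_hex: str) -> bool:
    """Hypothetical helper: True if the given MD5 hex digest is that of an empty body."""
    return md5_hex.lower() == EMPTY_MD5


print(EMPTY_MD5)
```

This only helps when post-processing httpx output that already contains the hash; it does not make body_md5 available to -filter-condition.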
Both word and line count are 1 for a completely empty body, but they can also be 1 for legitimate short responses, so I can't use them to filter out completely empty responses.
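One plausible explanation for the count of 1, assuming (not verified against the httpx source) that words and lines are counted as the number of pieces left after splitting the body on a separator: splitting an empty string still yields one (empty) piece. A small Python sketch of that assumption:

```python
def count_by_split(body: str, sep: str) -> int:
    # Assumption: the count is "number of pieces after splitting on sep",
    # so an empty body still produces a single empty piece.
    return len(body.split(sep))


for body in ("", "OK"):
    lines = count_by_split(body, "\n")
    words = count_by_split(body, " ")
    print(repr(body), "lines:", lines, "words:", words)
```

Under that assumption an empty body and a one-word body such as "OK" are indistinguishable by these counts, which matches the behaviour described above.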
I made a small debug Python server:
from http.server import BaseHTTPRequestHandler, HTTPServer


class SimpleHTTPRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/nothing':
            # No Content-Length header, no body.
            self.send_response(200)
            self.end_headers()
        elif self.path == '/nothing_cl0':
            # Content-Length: 0, no body.
            self.send_response(200)
            self.send_header('Content-Length', '0')
            self.end_headers()
        elif self.path == '/nothing_cl1':
            # Content-Length: 1, but still no body.
            self.send_response(200)
            self.send_header('Content-Length', '1')
            self.end_headers()


def run(server_class=HTTPServer, handler_class=SimpleHTTPRequestHandler, port=8000):
    server_address = ('', port)
    httpd = server_class(server_address, handler_class)
    print(f'Starting httpd on port {port}...')
    httpd.serve_forever()


if __name__ == '__main__':
    run()
Then I checked it with the following command (there will be some errors because httpx tries HTTPS first):
$ httpx-toolkit -u "localhost:8000" -path "/nothing_cl0" -filter-condition 'content_length < 1'
With /nothing and /nothing_cl0, the content_length variable isn't defined in the DSL. It's only defined for /nothing_cl1, where it equals 1 even though the actual body is still empty. I guess content_length being undefined when Content-Length: 0 is present is a bug, but my question still stands: how do I check the real response body length?
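To make the distinction between the declared and the real body length concrete outside of httpx, here is a minimal Python sketch. It parses raw response bytes shaped like the ones the debug server's three routes send and reports both values. The names `_FakeSocket` and `measure_body` are made up for illustration, and this is not how httpx works internally:

```python
import http.client
import io


class _FakeSocket:
    """Minimal stand-in so HTTPResponse can parse bytes held in memory."""

    def __init__(self, raw: bytes):
        self._file = io.BytesIO(raw)

    def makefile(self, *args, **kwargs):
        return self._file


def measure_body(raw: bytes):
    """Return (declared Content-Length or None, number of body bytes actually received)."""
    resp = http.client.HTTPResponse(_FakeSocket(raw))
    resp.begin()  # parse the status line and headers
    declared = resp.getheader("Content-Length")
    try:
        body = resp.read()
    except http.client.IncompleteRead as exc:
        # The header promised more bytes than arrived; keep what was received.
        body = exc.partial
    return (int(declared) if declared is not None else None, len(body))


# Raw responses modelled on the three routes of the debug server above.
samples = {
    "/nothing": b"HTTP/1.0 200 OK\r\n\r\n",
    "/nothing_cl0": b"HTTP/1.0 200 OK\r\nContent-Length: 0\r\n\r\n",
    "/nothing_cl1": b"HTTP/1.0 200 OK\r\nContent-Length: 1\r\n\r\n",
}

for path, raw in samples.items():
    declared, actual = measure_body(raw)
    print(path, "declared:", declared, "actual:", actual)
```

In all three cases the actual length is 0 while the declared value differs (absent, 0, and 1), which is why a filter based only on the Content-Length header cannot reliably detect an empty body.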
Originally posted by @Yardanico in https://github.com/projectdiscovery/httpx/discussions/1628#discussioncomment-8773146