gzip encoded http not decoding
First of all - great tool, thank you for building it.
I am having trouble decoding gzip-encoded HTTP. I built the latest version on Mac OS X, and as far as I can see zlib was found correctly during configure. When I run it in console mode, should it output the decoded content to the console?
To test it I used this:
https://github.com/ksmith97/GzipSimpleHTTPServer
I created an index.html file with just an html tag with an empty head and body
Ran tcpflow like this:
sudo tcpflow -i lo0 -c -e http
this is the output:
tcpflow: listening on lo0
127.000.000.001.57183-127.000.000.001.08000: GET /index.html HTTP/1.1
Host: localhost:8000
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.84 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
DNT: 1
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
If-Modified-Since: Sat, 11 Jun 2016 20:53:30 GMT
127.000.000.001.08000-127.000.000.001.57183: HTTP/1.0 200 OK
127.000.000.001.08000-127.000.000.001.57183: Server: SimpleHTTP/0.6 Python/2.7.11
127.000.000.001.08000-127.000.000.001.57183: Date: Sat, 11 Jun 2016 21:06:29 GMT
127.000.000.001.08000-127.000.000.001.57183: Content-type: text/html
127.000.000.001.08000-127.000.000.001.57183: Content-Encoding: gzip
127.000.000.001.08000-127.000.000.001.57183: Content-Length: 49
127.000.000.001.08000-127.000.000.001.57183: Last-Modified: Sat, 11 Jun 2016 20:53:30 GMT
127.000.000.001.08000-127.000.000.001.57183:
127.000.000.001.08000-127.000.000.001.57183: ....U}\W....(.......HML...0FR~J%X...J.U...7n.1...
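For reference, the decoding expected here can be sketched outside tcpflow. The following is a minimal Python sketch, not tcpflow's implementation: it assumes the complete response is already available as one byte string, and the helper name decode_http_response is hypothetical.

```python
import gzip


def decode_http_response(raw):
    """Split a raw HTTP response into (headers, body).

    The body is gunzipped when the headers say Content-Encoding: gzip.
    Hypothetical helper for illustration only; not part of tcpflow.
    """
    # Headers end at the first empty line; everything after it is the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:  # lines[0] is the status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if headers.get("content-encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    return headers, body


# Build a response shaped like the one in the capture above and decode it.
html = b"<html><head></head><body></body></html>"
payload = gzip.compress(html)
raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-type: text/html\r\n"
    b"Content-Encoding: gzip\r\n"
    b"Content-Length: " + str(len(payload)).encode() + b"\r\n"
    b"\r\n" + payload
)
headers, body = decode_http_response(raw)
print(body.decode())
```

The sketch handles only a body delimited by Content-Length; it does not handle chunked transfer encoding or a response split across several captures.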
I observe the same with 1.4.5. I wonder if this is a bug or we just failed to tell tcpflow to do it...
I'm not sure. There is a regression test that it passes; can you provide me with a set of packets that do not properly gunzip?
I ran curl -vL -H 'Accept-Encoding: gzip' http://abclinuxu.cz while sudo tcpflow -s -c -i any host abclinuxu.cz was capturing. This is what tcpflow sniffs:
171.025.221.158.00080-010.040.002.211.33610: HTTP/1.1 200 OK
Server: nginx
Date: Tue, 07 Nov 2017 08:07:00 GMT
Content-Type: text/html;charset=UTF-8
Content-Length: 20382
Connection: keep-alive
Set-Cookie: JSESSIONID=7t9xp6e0a5yragwrqcc4kc4g;Path=/;HttpOnly
Last-Modified: Tue, 07 Nov 2017 08:07:00 GMT
Expires: Fri, 22 Dec 2000 05:00:00 GMT
Cache-Control: no-cache, must-revalidate
Pragma: no-cache
Content-Encoding: gzip
XgT91f@*][iv*~h^_eY$I=&u&H_j{te{kf[3L4GYvO6K]}gj6GY([th_}8x_sJf`my@A4:s$W>zZ>yp\9|]mn*?QmT56F%VdH}|M\ow(/hoq/b^|V".E]TKEoZ%l=]z
!#)|+$)|+PaQA6"^L6ot8?~F,hl@<x-r;:e$Ic xo!:U,y+ K[i0OG#m;[C'Y!HT:A)
...
[Binary garbage continues]
If you can provide me with a packet dump, I will review it.
I am not sure what you mean by packet dump. Here is the output captured without any pretty-printing options. I have verified the content can be read by gzip -d (though it reports gzip: stdin: unexpected end of file): https://gist.github.com/olivergondza/aed85ef7e46b86693bdc4bfb82d65386#file-gziped
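An "unexpected end of file" from gzip -d usually means the stream was cut off before its trailer. As a rough illustration, the Python sketch below decompresses as much of a possibly truncated gzip stream as it can and reports whether the end of the stream was reached. The function name gunzip_partial is hypothetical, and this says nothing about how tcpflow itself handles the case.

```python
import zlib


def gunzip_partial(data):
    """Decompress as much of a gzip stream as possible.

    Returns (decoded_bytes, complete). complete is False when the input
    ended before the end of the gzip stream, e.g. a truncated capture.
    Raises zlib.error if the input is not gzip data at all.
    """
    # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decoder = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    decoded = decoder.decompress(data)
    return decoded, decoder.eof


import gzip

original = b"hello gzip " * 1000
stream = gzip.compress(original)

# Cutting off the last few bytes simulates a capture that ended early.
decoded, complete = gunzip_partial(stream[:-4])
print(len(decoded), complete)
```

Unlike gzip.decompress, which raises an error on truncated input, this approach keeps whatever was successfully decoded.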
I want you to give me a pcap file.
pcap it is: https://gist.github.com/olivergondza/aed85ef7e46b86693bdc4bfb82d65386#file-gzip-pcap
I'm having the same problem. I'm trying to use tcpflow to follow a REST API. Here's a pcap of a very simple request and response. It's only 7 packets: request.zip https://github.com/simsong/tcpflow/files/1484113/request.zip
tcpflow -r request.pcap -c -a -g
052.043.158.010.55634-172.031.021.034.00080: POST /PrototypeAppServlet HTTP/1.1
Content-Type: application/json
Content-Length: 172
Host: prototypeapp.jme5spybzc.us-west-2.elasticbeanstalk.com
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.5.2 (Java/1.8.0_121)
Accept-Encoding: gzip,deflate
{"methodName":"getData","requestInfo":{"customerId":753},"interfaceName":"CustomerRecordingEntry","userRole":"Customer","userId":"[email protected]","userCustomerId":753}
172.031.021.034.00080-052.043.158.010.55634: HTTP/1.1 200 OK
Server: nginx/1.10.2
Date: Fri, 17 Nov 2017 22:57:10 GMT
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Access-Control-Allow-Origin: *
Content-Encoding: gzip
5b
VJ-.+NKWVJI,ITQM-.NLOURWp+(**($+YrK)bH
0
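The response above combines Transfer-Encoding: chunked with Content-Encoding: gzip, so the body has to be reassembled from its chunks before it can be gunzipped. The 5b is a hexadecimal chunk size (91 bytes) and the trailing 0 marks the last chunk. A minimal Python sketch of that two-step decoding follows. The function names are hypothetical, it assumes the full body is in memory, and it is not a description of tcpflow's code.

```python
import gzip


def dechunk(body):
    """Reassemble a Transfer-Encoding: chunked body into one byte string.

    Raises ValueError if the body ends before the terminating 0-size chunk.
    """
    out = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # The chunk-size line is hexadecimal and may carry ";extension" data.
        size_field = body[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # last chunk; any trailer headers are ignored here
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that ends the chunk data
    return bytes(out)


def decode_chunked_gzip(body):
    """Undo the chunked framing first, then the gzip content encoding."""
    return gzip.decompress(dechunk(body))


# Build a two-chunk gzip body from made-up JSON and decode it again.
payload = gzip.compress(b'{"status": "example"}')
half = len(payload) // 2
chunked = b""
for piece in (payload[:half], payload[half:]):
    chunked += format(len(piece), "x").encode() + b"\r\n" + piece + b"\r\n"
chunked += b"0\r\n\r\n"
print(decode_chunked_gzip(chunked))
```

The order matters: gunzipping the raw chunked bytes fails because the chunk-size lines are interleaved with the compressed data.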
Much better. Thanks. I’ll take a look.
I have the same problems too!
Still unresolved... Same problem here
I won’t be able to get to this for a while.
Same problem. Here is my log.
wget -q https://www.dropbox.com/s/ma6cscyy5wjcd1n/image2.log
I captured it using Wireshark. Note that the log contains an image; otherwise, the decompress function works.
I'm in the process of doing a complete rewrite of the be13_api that's used by both tcpflow and bulk_extractor. This is an important issue, but the rewrite is more important. You are welcome to submit a patch, or have one of your students work on it as an exercise. Unfortunately, that's all I can offer at the moment.
Meanwhile, is it okay if I download your log and add it to the set of unit-tests?
Sure. That is what I created for testing.
The HTML and embedded image are here
wget -q https://www.dropbox.com/s/7pkkduka6014uko/image.html
wget -q https://www.dropbox.com/s/yb4kvvr1w2scikp/building_20201108_221645.jpg
Thanks again. I'll get to this when the be13_api rewrite is finished. I'll be making tcpflow work with the rewrite before bulk_extractor, as it's a simpler program. The whole system is being updated to C++17, with code coverage for the unit tests, which now use a standard unit test framework. It's a lot of work, but I'm learning a lot more about C++ and how it has changed over the past 20 years.
FWIW, it seems this only happens with -c; the body is correctly decoded when written to an HTTPBODY file, at least for me using 1.5.1.
Not working for me with 1.5.1.
The response has Content-Encoding: gzip and is displayed undecoded, with or without -c. 🤷
Thanks for the report. Nobody has worked on it, so it is not surprising that it does not work. Do you want to give it a try?