Crow icon indicating copy to clipboard operation
Crow copied to clipboard

`CROW_CATCHALL_ROUTE` response is truncated

Open cschreib opened this issue 1 year ago • 4 comments

Currently on the master branch (b516efe0a2cce558a6121f29f0c586ec6462999e). I tried using CROW_CATCHALL_ROUTE to implement some custom routing logic myself. I cannot seem to get it to work; the resulting response always ends up empty. When I enable debug logs, I see that it encounters an error (the URL I requested was http://0.0.0.0:8001/):

(2023-06-25 13:45:02) [INFO    ] Crow/master server is running at http://0.0.0.0:8001 using 4 threads
(2023-06-25 13:45:02) [INFO    ] Call `app.loglevel(crow::LogLevel::Warning)` to hide Info level logs.
(2023-06-25 13:45:02) [DEBUG   ] 0x1eafdf0 {0} queue length: 1
(2023-06-25 13:45:03) [DEBUG   ] 0x1eca050 {1} queue length: 1
(2023-06-25 13:45:03) [DEBUG   ] 0x1ecac60 timer cancelled: 0x7fe194499610 0
(2023-06-25 13:45:03) [DEBUG   ] task_timer cancelled: 0x7fe194499610 0
(2023-06-25 13:45:03) [DEBUG   ] task_timer scheduled: 0x7fe194499610 1
(2023-06-25 13:45:03) [DEBUG   ] 0x1ecac60 timer added: 0x7fe194499610 1
(2023-06-25 13:45:03) [DEBUG   ] Cannot match rules /. 
(2023-06-25 13:45:03) [INFO    ] Response: 0x1ecac60 / 200 0
(2023-06-25 13:45:03) [DEBUG   ] 0x1ecac60 timer cancelled: 0x7fe194499610 1
(2023-06-25 13:45:03) [DEBUG   ] task_timer cancelled: 0x7fe194499610 1
(2023-06-25 13:45:03) [DEBUG   ] 0x1ecac60 from read(1) with description: "stream ended at an unexpected time"
(2023-06-25 13:45:03) [DEBUG   ] 0x1ecac60 from write(2)

Here's the curl request result:

$ curl -v http://0.0.0.0:8001/
*   Trying 0.0.0.0:8001...
* TCP_NODELAY set
* Connected to 0.0.0.0 (127.0.0.1) port 8001 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:8001
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< X-Content-Type-Options: nosniff
< X-Frame-Options: DENY
< Strict-Transport-Security: max-age=63072000
* Connection #0 to host 0.0.0.0 left intact

Here's the code:

#include <crow.h>

int main() {
    crow::SimpleApp app;

    crow::logger::setLogLevel(crow::LogLevel::Debug);

    CROW_CATCHALL_ROUTE(app)
    ([](const crow::request& req) {
        crow::response res;
        res.add_header("Content-Type", "text/html");
        res.add_header("Access-Control-Allow-Origin", "*");
        res.add_header(
            "Content-Security-Policy", "script-src 'self' 'unsafe-eval' 'unsafe-inline'; "
                                       "default-src 'self' 'unsafe-inline' *.corentin.net");
        res.add_header("Strict-Transport-Security", "max-age=63072000");
        res.add_header("X-Frame-Options", "DENY");
        res.add_header("X-Content-Type-Options", "nosniff");
        res.body = "Not found";
        return res;
    });

    app.port(8001).multithreaded().run();
}

cschreib avatar Jun 25 '23 13:06 cschreib

Note that if the route is defined as just

    CROW_CATCHALL_ROUTE(app)([]() { return "Not found"; });

... then the body does get sent (very last line):

$ curl -v http://0.0.0.0:8001/
*   Trying 0.0.0.0:8001...
* TCP_NODELAY set
* Connected to 0.0.0.0 (127.0.0.1) port 8001 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:8001
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 9
< Server: Crow/master
< Date: Sun, 25 Jun 2023 13:47:37 GMT
< 
* Connection #0 to host 0.0.0.0 left intact
Not found

The error is still printed in the logs though. In the original message above, some of the headers were written, but not all. This suggests that Crow is able to write some of the content, but something is aborting sending all the data.

For reference, using a static route like

    CROW_ROUTE(app, "/")
    ([] {
        crow::response res;
        res.add_header("Content-Type", "text/html");
        res.add_header("Access-Control-Allow-Origin", "*");
        res.add_header(
            "Content-Security-Policy", "script-src 'self' 'unsafe-eval' 'unsafe-inline'; "
                                       "default-src 'self' 'unsafe-inline' *.corentin.net");
        res.add_header("Strict-Transport-Security", "max-age=63072000");
        res.add_header("X-Frame-Options", "DENY");
        res.add_header("X-Content-Type-Options", "nosniff");
        res.body = "Not found";
        return res;
    });

... results in this log:

(2023-06-25 13:52:20) [INFO    ] Crow/master server is running at http://0.0.0.0:8001 using 4 threads
(2023-06-25 13:52:20) [INFO    ] Call `app.loglevel(crow::LogLevel::Warning)` to hide Info level logs.
(2023-06-25 13:52:20) [DEBUG   ] 0xe6e6a0 {0} queue length: 1
(2023-06-25 13:52:21) [DEBUG   ] 0xe6de00 {1} queue length: 1
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 timer cancelled: 0x7f8f0f992610 0
(2023-06-25 13:52:21) [DEBUG   ] task_timer cancelled: 0x7f8f0f992610 0
(2023-06-25 13:52:21) [DEBUG   ] task_timer scheduled: 0x7f8f0f992610 1
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 timer added: 0x7f8f0f992610 1
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 timer cancelled: 0x7f8f0f992610 1
(2023-06-25 13:52:21) [DEBUG   ] task_timer cancelled: 0x7f8f0f992610 1
(2023-06-25 13:52:21) [INFO    ] Request: 127.0.0.1:39568 0xe88d60 HTTP/1.1 GET /
(2023-06-25 13:52:21) [DEBUG   ] Matched rule '/' 1 / 2
(2023-06-25 13:52:21) [INFO    ] Response: 0xe88d60 / 200 0
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 timer cancelled: 0x7f8f0f992610 1
(2023-06-25 13:52:21) [DEBUG   ] task_timer cancelled: 0x7f8f0f992610 1
(2023-06-25 13:52:21) [DEBUG   ] task_timer scheduled: 0x7f8f0f992610 2
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 timer added: 0x7f8f0f992610 2
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 timer cancelled: 0x7f8f0f992610 2
(2023-06-25 13:52:21) [DEBUG   ] task_timer cancelled: 0x7f8f0f992610 2
(2023-06-25 13:52:21) [DEBUG   ] 0xe88d60 from read(1) with description: "success"

... and this curl output:

$ curl -v http://0.0.0.0:8001/
*   Trying 0.0.0.0:8001...
* TCP_NODELAY set
* Connected to 0.0.0.0 (127.0.0.1) port 8001 (#0)
> GET / HTTP/1.1
> Host: 0.0.0.0:8001
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< X-Frame-Options: DENY
< Strict-Transport-Security: max-age=63072000
< X-Content-Type-Options: nosniff
< Content-Security-Policy: script-src 'self' 'unsafe-eval' 'unsafe-inline'; default-src 'self' 'unsafe-inline' *.corentin.net
< Access-Control-Allow-Origin: *
< Content-Type: text/html
< Content-Length: 9
< Server: Crow/master
< Date: Sun, 25 Jun 2023 13:52:21 GMT
< Connection: Keep-Alive
< 
* Connection #0 to host 0.0.0.0 left intact
Not found

cschreib avatar Jun 25 '23 13:06 cschreib

I tried comparing the behavior of Crow in both cases (catch-all and static route) to understand the difference. First, I noticed that the catch-all handler is called very early, in Connection::handle_url(). At this stage, the HTTP parser hasn't consumed the rest of the request, so the crow::Request object is very incomplete (missing headers, remote IP address, body, etc.). The documentations doesn't mention this limitation, so I assume this is not intentional.

cschreib avatar Jun 25 '23 14:06 cschreib

The problem isn't the incomplete request, however. I think it has to do, somehow, with the use of asynchronous operations here in both do_write() and do_read(): https://github.com/CrowCpp/Crow/blob/13a91a1941fbabfc289dddcdeab08b80193f7c6c/include/crow/http_connection.h#L424-L430

If I add some debugging commands to print the bytes_processed and the error code message for the read and write callbacks, I get the following when the issue happens:

(2023-06-25 16:26:28) [DEBUG   ] 0x62200000f110 from read(0) bytes: 79, code: Success
(2023-06-25 16:26:28) [DEBUG   ] 0x62200000f110 from read(1) with description: "stream ended at an unexpected time"
(2023-06-25 16:26:28) [DEBUG   ] 0x62200000f110 from write(0) bytes: 241, code: Bad file descriptor
(2023-06-25 16:26:28) [DEBUG   ] 0x62200000f110 from write(2)

I.e., the read callback is called before the write callback. If I modify do_write() and do_read() to use blocking calls to ASIO, then the issue disappears, and the logs become:

(2023-06-25 16:23:39) [DEBUG   ] 0x62200000f110 from write(0) bytes: 388, code: Success
(2023-06-25 16:23:39) [DEBUG   ] 0x62200000f110 from read(0) bytes: 79, code: Success
(2023-06-25 16:23:39) [DEBUG   ] 0x62200000f110 from read(1) with description: "stream ended at an unexpected time"

The error "stream ended at an unexpected time" is still there (so that might have been a red herring), but everything else seems to be OK; I get the full response back whether I use the static route or the catch-all route.

Everything also seems to be OK if I just change do_write() to be blocking, and leave do_read() as async. I don't have a good enough understanding of Crow to figure out if they should both be blocking.

cschreib avatar Jun 25 '23 16:06 cschreib

For reference, here's the code I used for the synchronous do_write(): https://github.com/CrowCpp/Crow/commit/45d53978a4c038ce7c04bc5cfb2c0a0d4c3cfbca, and here's the synchronous do_read(): https://github.com/CrowCpp/Crow/commit/1dfc42c34cd0e55adc415869ac368d779125b55f.

cschreib avatar Jun 25 '23 16:06 cschreib