
StreamedJsonResponse does not stream

Open macropay-solutions opened this issue 7 months ago • 32 comments

Hi,

We implemented a StreamedJsonResponse for large sets of data and we noticed it crashes with memory issues because the json_encode can't be used for streaming data.

What alternatives exist to this issue?

If it can't stream data, why is it called Streamed?

Thank you.

Originally posted by @macropay-solutions in https://github.com/symfony/symfony/discussions/60252

macropay-solutions avatar Apr 23 '25 13:04 macropay-solutions

Hey @macropay-solutions,

AFAIK, the StreamedJsonResponse does stream data. But to do so, it needs to iterate over an iterable. This means that if your data is a first-level iterable, it should be streamed properly. However, if it is nested in an object, for example, the stream will consist of only one chunk (and therefore won't be streamed).

If you're in that second case, you might want to take a look at the upcoming and experimental JsonStreamer component, which will be released in Symfony 7.3
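
To illustrate the top-level iteration described above, here is a minimal sketch of encoding a JSON array one item at a time instead of calling json_encode on the whole data set. It is a simplified illustration, not Symfony's actual implementation; `streamJsonArray` and the `$write` callback are hypothetical names.

```php
<?php

/**
 * Writes $items as a single JSON array, encoding one item at a time.
 * Simplified illustration only; not Symfony's actual implementation.
 *
 * @param iterable<mixed>        $items
 * @param callable(string): void $write hypothetical sink, e.g. echo followed by flush
 */
function streamJsonArray(iterable $items, callable $write, int $flags = 0): void
{
    $write('[');
    $first = true;
    foreach ($items as $item) {
        // Only the current item exists as an encoded string at any time.
        $write(($first ? '' : ',') . json_encode($item, $flags | JSON_THROW_ON_ERROR));
        $first = false;
    }
    $write(']');
}

// Usage: a generator stands in for a lazy database cursor.
$rows = (static function (): iterable {
    for ($id = 2000; $id < 2003; ++$id) {
        yield ['id' => $id, 'currency' => 'EUR'];
    }
})();

streamJsonArray($rows, static function (string $chunk): void {
    echo $chunk;
});
```

With this approach the memory used by encoding depends on the largest single item, provided the iterable itself is lazy and nothing upstream buffers the output.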

mtarld avatar Apr 23 '25 16:04 mtarld

@mtarld we tried giving it directly the iterator (LazyCollection), not an array with an iterator in it. In both cases we received memory error on 3.7 mil rows for example.

Update

Description of the behaviour: https://github.com/laravel/framework/discussions/55509#discussioncomment-12912014

Update:

With the LazyCollection as the first param, the result looks like this:

[
    {
        "id": 2000,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",
        "updated_at": null,
        "primary_key_identifier": "2000"
    },...
]

macropay-solutions avatar Apr 23 '25 16:04 macropay-solutions

I think you're right, it might be something here 🙂 IMO, the flush is missing (same for StreamedResponse::setChunks()).

mtarld avatar Apr 23 '25 17:04 mtarld

@mtarld Nope. We tried also with flush and ob_flush. If you read through the related issue from laravel, you will see why.

macropay-solutions avatar Apr 23 '25 17:04 macropay-solutions

@mtarld

If you're in that second case, you might want to take a look at the upcoming and experimental JsonStreamer component, which will be released in Symfony 7.3

Is JS capable of decoding partial json strings if the json is sent like:

[
    {
        "id": 2000,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",
        "updated_at": null,
        "primary_key_identifier": "2000"
    },

then

    {
        "id": 2001,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",
        "updated_at": null,
        "primary_key_identifier": "2001"
    },

then

    {
        "id": 2002,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",
        "updated_at": null,
        "primary_key_identifier": "2002"
    }
]

?

macropay-solutions avatar Apr 23 '25 17:04 macropay-solutions

But in our implementation, which echoes each JSON on a new row and calls ob_flush and flush after each echo, about 300-400 rows out of 20k are split while being sent. So the above question about JS becomes:

Is JS capable of decoding partial json strings if the json is sent like:

[
    {
        "id": 2000,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",
        "updated_at": null,
        "primary_key_identifier": "2000"

then

    },
    {
        "id": 2001,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",

then

        "updated_at": null,
        "primary_key_identifier": "2001"
    },
    {
        "id": 2002,
        "parent_id": null,
        "client_id": 49507,
        "currency": "EUR",
        "value": "23.00",
        "created_at": "2024-01-17 10:05:07",
        "updated_at": null,
        "primary_key_identifier": "2002"
    }
]

?
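
For the one-JSON-per-row variant described above, a consumer usually copes with split rows by buffering bytes until a newline arrives and decoding only complete lines. The sketch below shows that idea in PHP to match the rest of the thread; the same logic applies in JS. `JsonLinesBuffer` is a hypothetical helper, and it assumes each row is encoded without JSON_PRETTY_PRINT so a row never contains a raw newline.

```php
<?php

/**
 * Reassembles newline-delimited JSON from chunks split at arbitrary bytes.
 * Hypothetical helper for illustration.
 */
final class JsonLinesBuffer
{
    private string $pending = '';

    /** @return list<mixed> rows completed by this chunk */
    public function push(string $chunk): array
    {
        $this->pending .= $chunk;
        $lines = explode("\n", $this->pending);
        // The last element is an incomplete row (or '' after a trailing newline).
        $this->pending = array_pop($lines);

        $rows = [];
        foreach ($lines as $line) {
            if (trim($line) === '') {
                continue; // ignore blank lines
            }
            $rows[] = json_decode($line, true, 512, JSON_THROW_ON_ERROR);
        }

        return $rows;
    }
}

// Usage: the second row arrives split across two chunks.
$buffer = new JsonLinesBuffer();
foreach (["{\"id\":2000}\n{\"id\":20", "01}\n"] as $chunk) {
    foreach ($buffer->push($chunk) as $row) {
        echo $row['id'], "\n";
    }
}
```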

macropay-solutions avatar Apr 23 '25 17:04 macropay-solutions

I'll try to dig it out soon, can you in the meantime create and share a reproducer so we are sure we are talking about the same thing?

mtarld avatar Apr 23 '25 17:04 mtarld

We have little experience with symfony, this is a laravel example:


return new \Symfony\Component\HttpFoundation\StreamedJsonResponse(
    Operation::query()->with('client')->lazyByIdDesc(1000, 'id'),
     200,
     [],
     JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);

Can be seen in action here https://laravel-crud-wizard.com/laravel-10/laravel-lumen-crud-wizard#operations

Put in textarea

limit=-1

and submit => a prompt to download the json file will appear with a delay (until it builds it in memory). The default condition is id < 20000 to not give memory error.

To compare with fewer rows in the response, use in the textarea:

limit=-1&created_at[from]=2020-04-23 00:00:00&updated_at[o]=isNotNull

This will be almost instant returning

[
    {
        "id": 9,
        "parent_id": 3,
        "client_id": 177601,
        "currency": "EUR",
        "value": "75.00",
        "created_at": "2024-01-17 10:05:04",
        "updated_at": "2025-04-17 09:06:48",
        "primary_key_identifier": "9"
    },
    {
        "id": 8,
        "parent_id": null,
        "client_id": 45015,
        "currency": "EUR",
        "value": "43.00",
        "created_at": "2024-01-17 10:05:04",
        "updated_at": "2025-04-17 09:15:20",
        "primary_key_identifier": "8"
    },
    {
        "id": 7,
        "parent_id": 3,
        "client_id": 126362,
        "currency": "EUR",
        "value": "95.00",
        "created_at": "2024-01-17 10:05:04",
        "updated_at": "2025-04-16 16:39:25",
        "primary_key_identifier": "7"
    },
    {
        "id": 6,
        "parent_id": 2,
        "client_id": 84224,
        "currency": "EUR",
        "value": "97.00",
        "created_at": "2024-01-17 10:05:04",
        "updated_at": "2025-04-16 13:47:33",
        "primary_key_identifier": "6"
    },
    {
        "id": 5,
        "parent_id": 2,
        "client_id": 97396,
        "currency": "EUR",
        "value": "79.00",
        "created_at": "2024-01-17 10:05:04",
        "updated_at": "2025-04-16 15:32:53",
        "primary_key_identifier": "5"
    },
    {
        "id": 4,
        "parent_id": 2,
        "client_id": 165915,
        "currency": "EUR",
        "value": "89.00",
        "created_at": "2024-01-17 10:05:04",
        "updated_at": "2025-04-16 13:33:50",
        "primary_key_identifier": "4"
    },
    {
        "id": 1,
        "parent_id": null,
        "client_id": 1,
        "currency": "EUR",
        "value": "10.00",
        "created_at": "2024-01-03 19:39:02",
        "updated_at": "2025-04-16 13:51:10",
        "primary_key_identifier": "1"
    }
]

For normal json response (and executed queries) for the above last situation use:

limit=0&created_at[from]=2020-04-23 00:00:00&updated_at[o]=isNotNull&id[to]=20000

to see only the count

or

limit=10&created_at[from]=2020-04-23 00:00:00&updated_at[o]=isNotNull&id[to]=20000

to see all in normal json response.

macropay-solutions avatar Apr 23 '25 18:04 macropay-solutions

As stated here, flushing is your responsibility. Therefore, have you tried something like:

// Wraps any iterable and flushes the output buffers every 200 items.
$iterateAndFlush = static function (iterable $iterable): iterable {
    $i = 0;
    foreach ($iterable as $item) {
        yield $item;

        // Skip the flush at $i === 0, then flush after every 200th item.
        if ($i && $i % 200 === 0) {
            ob_flush();
            flush();
        }

        ++$i;
    }
};

return new \Symfony\Component\HttpFoundation\StreamedJsonResponse(
    $iterateAndFlush(Operation::query()->with('client')->lazyByIdDesc(1000, 'id')),
    200,
    [],
    JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);

?

mtarld avatar Apr 24 '25 07:04 mtarld

@mtarld https://github.com/symfony/symfony/issues/60257#issuecomment-2825000213

@mtarld Nope. We tried also with flush and ob_flush. If you read through the related issue from laravel, you will see why.

macropay-solutions avatar Apr 24 '25 07:04 macropay-solutions

Can you share that piece of code with me? Maybe something is going on there.

mtarld avatar Apr 24 '25 07:04 mtarld

We edited the local vendor file at this line putting the ob_flush and flush https://github.com/laravel/framework/blob/44e6a294e4441e9e3338008af0288979b3f677e8/src/Illuminate/Database/Concerns/BuildsQueries.php#L342

macropay-solutions avatar Apr 24 '25 07:04 macropay-solutions

We added ob_flush();flush(); in the demo page at that line also now.

yield $result;ob_flush();flush();

macropay-solutions avatar Apr 24 '25 08:04 macropay-solutions

@mtarld the whole idea of streaming a JSON was wrong from the start. Even if it worked (sending chunks of JSON without using json_encode on everything), as we mentioned in this comment, JS can't decode the JSON until it has received all of it, which makes the stream useless from JavaScript's point of view.

From the BE point of view, avoiding memory issues would be the only advantage, if it worked...

Our solution (demo) sends each row from DB as a separate json on a new line, making it possible for JS to decode and display it immediately without waiting for the whole stream to end.
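
A minimal sketch of this one-JSON-per-line idea in plain PHP follows. It is not the code of the demo; `writeJsonLines` is a hypothetical name, and the sample rows are made up for illustration.

```php
<?php

/**
 * Writes each row as its own JSON document followed by a newline.
 * Hypothetical helper; do not pass JSON_PRETTY_PRINT, since it would
 * spread one row over several lines.
 *
 * @param iterable<mixed> $rows
 * @param resource        $out  any writable stream
 *
 * @return int number of rows written
 */
function writeJsonLines(
    iterable $rows,
    $out,
    int $flags = JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
): int {
    $count = 0;
    foreach ($rows as $row) {
        fwrite($out, json_encode($row, $flags | JSON_THROW_ON_ERROR) . "\n");
        fflush($out); // push the row out before the next one is fetched
        ++$count;
    }

    return $count;
}

// Usage: write two sample rows to the response body.
$out = fopen('php://output', 'wb');
writeJsonLines(
    [
        ['id' => 17009, 'value' => '92.00'],
        ['id' => 17008, 'value' => '87.00'],
    ],
    $out
);
```

Each line is a complete JSON document, so a consumer can decode and display a row as soon as its newline arrives.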

Dumb example:

Suppose you want to watch a movie. The current StreamedJsonResponse logic prevents you, on the FE (and on the BE as it is coded now), from viewing it until you have downloaded all of it.

macropay-solutions avatar Apr 24 '25 10:04 macropay-solutions

JS can't decode it until it receives it all, making the stream useless from javascript's point of view.

Not all API consumers are implemented in Javascript. Some other languages have streaming decoders for JSON (and there might be userland streaming decoders in JS as well btw)

stof avatar Apr 24 '25 10:04 stof

Our solution (demo) sends each row from DB as a separate json on a new line

this is not a JSON response then, but a JSONND response, which is a different content type.

stof avatar Apr 24 '25 10:04 stof

Btw, Symfony already supports returning JSONND responses in a streaming way (since years) by using the StreamedResponse (as you can use a normal json_encode for each line of the JSONND response)

stof avatar Apr 24 '25 11:04 stof

@stof Our solution extends StreamedResponse but we don't put types on keys like

{ 
  "name:string": "Alice", 
  "isActive:boolean": 0,
  "amountPaid:currency": 20 
}

So, it is not JSONND.

    {"id":17009,"value":"92.00","created_at":"2024-01-17 09:17:11","updated_at":null,"primary_key_identifier":"17009"}
    {"id":17008,"value":"87.00","created_at":"2024-01-17 09:17:11","updated_at":null,"primary_key_identifier":"17008"}

UPDATE

According to https://en.m.wikipedia.org/wiki/JSON_streaming

Newline-delimited JSON: Two terms for equivalent formats of line-delimited JSON are:
- Newline delimited (NDJSON). The old name was Line delimited JSON (LDJSON).
- JSON lines (JSONL), that is the current (2025) and most used standard, in Big Data and other applications.

According to https://github.com/glenkleidon/JSON-ND

JSON-ND or JSON with Named Datatypes

Based on this, we could use these MIME types: application/jsonl, application/x-ndjson, and application/x-jsonlines but because none are standardized, we left it as application/json https://jsonlines.org/

macropay-solutions avatar Apr 24 '25 11:04 macropay-solutions

@mtarld we deployed your changes from here in our demo page. Still memory error for 1 mil rows.

[2025-04-26 14:12:23] test.DEBUG: new change blade 1000000  
[2025-04-26 14:12:23] test.DEBUG: new change  StreamedJsonResponse
[2025-04-26 14:12:53] test.ERROR: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 16777224 bytes) {"exception":"[object] (Symfony\\Component\\ErrorHandler\\Error\\FatalError(code: 0): Allowed memory size of 1073741824 bytes exhausted (tried to allocate 16777224 bytes) at /home/.../laravel-10/vendor/laravel/framework/src/Illuminate/Collections/LazyCollection.php:106)
[stacktrace]
#0 {main}
"} 

We rolled it back to 20000 rows from 1 mil and to the original StreamedJsonResponse.

This is how we call it

return new \Symfony\Component\HttpFoundation\StreamedJsonResponse($lazyCollection, 200, [], JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);

macropay-solutions avatar Apr 26 '25 14:04 macropay-solutions

I am afraid that we will not be able to help you without an application that easily allows us to reproduce your issue. Otherwise you will probably have to debug a possible solution yourself.

xabbuh avatar Apr 26 '25 14:04 xabbuh

@xabbuh Thank you. The situation is the other way around. We offered symfony our help by raising this issue and providing an alternative.

The docs could be changed to point this out.

macropay-solutions avatar Apr 26 '25 14:04 macropay-solutions

Reporting a bug is helpful for the project for sure, yet the bug reported here depends on multiple factors, including ones that are out of Symfony's control (LazyCollection from Laravel and your own code). Hence we ask you to provide code one can run to reproduce the bug. That would make your report even more helpful, as it would enable us to confirm the bug and eventually fix it.

chalasr avatar Apr 26 '25 14:04 chalasr

@chalasr we don't have a symfony dev to create a symfony demo project that reproduces this atm.

macropay-solutions avatar Apr 27 '25 06:04 macropay-solutions

Note that any php project would work, even a laravel one or a vanilla script.

chalasr avatar Apr 27 '25 20:04 chalasr

@chalasr you can use this then https://github.com/macropay-solutions/laravel-crud-wizard-decorator-free-demo/tree/StreamedJsonResponse_not_streaming

create .env from .env example

run:

composer install
php artisan key:generate
php artisan migrate
php artisan db:seed --class=DatabaseSeeder

Call

GET /api/operations?limit=-1

macropay-solutions avatar Apr 28 '25 04:04 macropay-solutions

We will remove the not streaming demo from our laravel 10 demo project now.

macropay-solutions avatar May 07 '25 10:05 macropay-solutions

Cloned 👍

chalasr avatar May 07 '25 10:05 chalasr

@chalasr The demo project branch for laravel 12 is not affected, so you can continue using it. We only aligned the demo webpage for laravel 10 with the laravel 9 demo version, so both now use streamed json lines.

Update: One of our team members received an error on ob_flush locally when testing, and that explains the @ob_flush used here. But while using docker or shared hosting the error does not occur.
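
One plausible cause of that local error, stated as an assumption since the local setup is not shown: PHP raises a notice when ob_flush() is called while no output buffer is active. A sketch of a guard that avoids the @ suppression (`flushOutput` is a hypothetical name):

```php
<?php

/**
 * Flushes PHP's user-level output buffer only when one is active,
 * then flushes the system write buffers.
 *
 * @return bool whether a user-level buffer was flushed
 */
function flushOutput(): bool
{
    $hadBuffer = ob_get_level() > 0;
    if ($hadBuffer) {
        ob_flush(); // safe: at least one buffer is active
    }
    flush();

    return $hadBuffer;
}

// Usage: call after each echoed row or batch of rows.
echo "{\"id\":1}\n";
flushOutput();
```

Whether a buffer is active depends on the output_buffering setting and on any ob_start() calls made by the framework, which may explain why the error appears in one environment and not in another.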

macropay-solutions avatar May 07 '25 10:05 macropay-solutions

Not directly related to this issue, but @macropay-solutions, why would you use json for 1 million records?

json is really bad for bulk data, the keys are repeated in each object ... csv is better for bulk data (if we're just sticking to text formats)

faizanakram99 avatar May 24 '25 18:05 faizanakram99

@faizanakram99 We also have csv download that streams ok, but you can stream only 2D data as CSV. Json lines stream can handle more complex data structures. An example would be a nested json:

{"a": 1, "b": {"c": 2}}

Elastic search responses work like a CSV. The columns are sent only in the 1st page, and after that only lists of values are returned. In FE that is hard to work with.

What led to this issue was our desire to have a way of sending ALL without memory issues.

macropay-solutions avatar May 24 '25 18:05 macropay-solutions