Rendering IAsyncEnumerable

Open Tyrrrz opened this issue 5 years ago • 7 comments

Hi there. I've been using Scriban for quite some time now, and I recently ran into a requirement to render a rather large stream of data using a Scriban template. Is it possible to do this with IAsyncEnumerable? The underlying data is too large to hold in RAM, so streaming is a necessity.

Tyrrrz avatar Dec 03 '19 23:12 Tyrrrz

Currently I'm handling this in my project by splitting the template into 3 parts:

  • static leading block
  • item block
  • static trailing block

I'm rendering the leading block first, then for each item in the async enumerable I render using the second part, and then finally finish up with the third part.
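
A minimal sketch of this three-part approach, under stated assumptions: the class name `ChunkedRenderer`, the markup, and the `item` variable with its `id` and `name` members are hypothetical and only illustrate the idea. It relies on Scriban's `Template.Parse` and `Template.Render`, and it is not the code from the project mentioned above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Scriban;

public static class ChunkedRenderer
{
    // Hypothetical split of one template into three parts (illustrative markup only).
    private static readonly Template Leading = Template.Parse("<ul>\n");
    private static readonly Template ItemPart = Template.Parse("  <li>{{ item.id }}: {{ item.name }}</li>\n");
    private static readonly Template Trailing = Template.Parse("</ul>\n");

    public static async Task RenderAsync<T>(IAsyncEnumerable<T> items, TextWriter output)
    {
        // Static leading block, rendered once.
        await output.WriteAsync(Leading.Render());

        // Item block, rendered once per element as it arrives from the async stream.
        await foreach (var item in items)
        {
            await output.WriteAsync(ItemPart.Render(new { item }));
        }

        // Static trailing block, rendered once.
        await output.WriteAsync(Trailing.Render());
    }
}
```

Passing a `StreamWriter` as `output` sends each rendered chunk to a file as soon as it is produced, so the full result is never built up as a single string. The trade-off is that the item block cannot see loop-level state (such as an index or whether it is the last item) unless the caller passes it in explicitly.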

Tyrrrz avatar Dec 07 '19 19:12 Tyrrrz

I don't remember the details. It could be possible, but I'm not sure. It would definitely require modifying Scriban. If you have time to experiment with it and try it yourself, let me know.

xoofx avatar Jan 26 '20 07:01 xoofx

I would also like to make use of streams as we have a large data set. Is this possible today?

gorillapower avatar Feb 14 '22 18:02 gorillapower

You can always iterate over a large data set with a regular IEnumerable. IAsyncEnumerable is not a prerequisite for scaling, and it won't change much, especially if you have a single app/thread processing the input data.

xoofx avatar Feb 15 '22 06:02 xoofx

Ah, well noted. I guess my thinking was to try and have the engine read the input parameters as stream data and then have it return the result as a readable stream. My input could be very large and potentially would create a large template and so I would like to have the ability to stream the result to a file or similar destination. I understand having end to end streaming capabilities is perhaps not a common requirement however.

gorillapower avatar Feb 15 '22 08:02 gorillapower

You can also stream the output with Scriban; see #54.

xoofx avatar Feb 15 '22 08:02 xoofx

I'm having success with the output streams, thank you for the direction.

However, for the input model, I have a property body that returns an IEnumerable<object> using yield return. Notice that the source is a Stream.

Method

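// Requires: using System.Collections.Generic; using System.IO; using Newtonsoft.Json;
private static IEnumerable<object> GetJsonObjectEnumerable(Stream inputStream)
{
    JsonSerializer serializer = new JsonSerializer();

    using (StreamReader sr = new StreamReader(inputStream))
    using (JsonReader reader = new JsonTextReader(sr))
    {
        // Advance token by token; objects are deserialized and yielded one at a time.
        while (reader.Read())
        {
            if (reader.TokenType == JsonToken.StartObject)
            {
                // Deserialize consumes the whole object, so nested objects are not yielded separately.
                var item = serializer.Deserialize(reader);
                yield return item;
            }
        }
    }
}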

Execution

var model = new
{
    response = new 
    {
        body = GetJsonObjectEnumerable(streamInput),
        status = 200
    }
};
            

string templateString = @"[{% for product in response.body %}
                            ""{{ product.id }}"",
                        {% endfor %}]";

using (var fileStream = new FileStream("c:/temp/test.json",FileMode.Create))
using (var textWriter = new StreamWriter(fileStream))
{
    //Model
    ScriptObject scriptObject1 = new ScriptObject();
    scriptObject1.Import(model);

    //Template with custom textwriter output
    LiquidTemplateContext context = new LiquidTemplateContext();
    context.PushOutput(new TextWriterOutput(textWriter));
    context.PushGlobal(scriptObject1);
    context.LoopLimit = 0;

    //evaluate template
    Template template = Template.ParseLiquid(templateString);
    await context.EvaluateAsync(template.Page);
}

At the evaluation step, the entire IEnumerable is iterated over and loaded into memory before the template is executed (in my example it consumes 1.5 GB of memory).

Do I need to define a custom IEnumerator/IEnumerable so that the internal code does not try to iterate over the whole list? I'm looking at this file https://github.com/scriban/scriban/blob/master/src/Scriban/Runtime/Accessors/ListAccessor.cs and thinking that the .Count could trigger a scan.
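
Before writing a custom IEnumerable, one way to confirm whether the sequence really is consumed up front is to wrap it in a small probe. The sketch below uses only the .NET base class library and makes no claim about what Scriban does internally; `EnumerationProbe` and `Track` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationProbe
{
    // Wraps a sequence lazily and reports the running count each time an element is pulled.
    public static IEnumerable<T> Track<T>(IEnumerable<T> source, Action<int> onPulled)
    {
        var pulled = 0;
        foreach (var item in source)
        {
            onPulled(++pulled);
            yield return item;
        }
    }
}
```

Assigning `body = EnumerationProbe.Track(GetJsonObjectEnumerable(streamInput), n => Console.Error.WriteLine($"pulled {n}"))` in the model would show the order of events: if every "pulled" line appears before anything reaches the output file, something is buffering the whole sequence; if the lines appear interleaved with output being written, the enumeration is lazy and the memory is being held elsewhere.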

gorillapower avatar Feb 15 '22 14:02 gorillapower