[QUESTION] Better way to send an in-memory file?

Open cgarciae opened this issue 4 years ago • 7 comments

I have a pandas dataframe that I want to send to the client as a CSV file. I am currently doing this:

    file = StringIO()
    df_solution.to_csv(file)

    return StreamingResponse(
        iter([file.getvalue()]),
        ...
    )

This works, but I don't know if it's the best way. Intuitively, you should be able to send the buffer directly, or at least avoid creating a single-value iterator from a list. FileResponse asks for a path, so I guess it's not what I want. This is not a pressing issue, but it's not obvious what the best solution is in this case.

cgarciae avatar Apr 16 '20 05:04 cgarciae

For a small file it's probably fine. If you want to make sure you aren't blocking, you can provide an async generator to StreamingResponse.
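A minimal sketch of what such an async generator could look like, using only the standard library. The name `iter_chunks` and the default chunk size are assumptions for illustration, not part of FastAPI.

```python
import asyncio
from typing import AsyncIterator


async def iter_chunks(text: str, chunk_size: int = 64 * 1024) -> AsyncIterator[str]:
    """Yield `text` in pieces of at most `chunk_size` characters.

    `iter_chunks` is a hypothetical helper name, not a FastAPI API.
    """
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]
        # Hand control back to the event loop between chunks.
        await asyncio.sleep(0)
```

The generator object, e.g. `iter_chunks(file.getvalue())`, would then be passed as the first argument to StreamingResponse in place of `iter([file.getvalue()])`. Note that the whole string is still held in memory; this only splits the sending into smaller steps.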

chris-allnutt avatar Apr 22 '20 01:04 chris-allnutt

I have a similar use case with serving very small image files that are pulled from a database. I don't want to have to write to the file system so I don't think I can use FileResponse. I planned to do something very similar to this but with BinaryIO.

Kilo59 avatar Jul 09 '20 15:07 Kilo59

Here is a small snippet of my code. I am also wondering if there is a better way.

csv generator

import csv
from io import StringIO


def newcsv(data, csvheader, fieldnames):
    """
    Create a new csv file that represents generated data.
    """
    new_csvfile = StringIO()
    # Write the header row first, then switch to a DictWriter for the data rows.
    wr = csv.writer(new_csvfile, quoting=csv.QUOTE_NONNUMERIC)
    wr.writerow(csvheader)
    wr = csv.DictWriter(new_csvfile, fieldnames=fieldnames)

    wr.writerows(data)
    new_csvfile.seek(0)  # rewind to the first position so readline() returns something
    line = new_csvfile.readline()
    while len(line) > 0:
        yield line
        line = new_csvfile.readline()
    new_csvfile.close()

StreamingResponse

return StreamingResponse(csv, media_type="text/csv", headers={'Content-Disposition': 'filename=generated.csv'})

igorovic avatar Dec 04 '20 06:12 igorovic

If you want to generate the csv string with pandas upfront and then send it in the response (i.e. non-async), you can use PlainTextResponse which should also perform better than the StreamingResponse with a normal iterator.

return PlainTextResponse(csv_str, media_type="text/csv")

scriptator avatar May 07 '21 06:05 scriptator

I have used the snippet below in situations where my files are in memory (BytesIO) but I want to send the file as an attachment that can be downloaded.

  response = StreamingResponse(buffer, media_type="application/zip")
  response.headers["Content-Disposition"] = "attachment; filename=images.zip"

  return response
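The snippet above does not show how `buffer` is built. As a sketch under stated assumptions, an in-memory zip could be produced with the standard library as follows; `build_zip` and its `files` argument are hypothetical names used only for illustration.

```python
import io
import zipfile
from typing import Dict


def build_zip(files: Dict[str, bytes]) -> io.BytesIO:
    """Pack `files` (archive name -> content) into an in-memory zip.

    `build_zip` is a hypothetical helper, not part of FastAPI.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    # Rewind: after writing, the position is at the end of the buffer,
    # so anything that reads or iterates it would see no data.
    buffer.seek(0)
    return buffer
```

The `seek(0)` is the step that is easy to forget: a BytesIO that has just been written to is positioned at its end, so iterating it without rewinding yields nothing.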

rehanhaider avatar Jun 12 '21 19:06 rehanhaider

In my opinion, the problem with writing to a StringIO file is that we are still loading all of the data into memory, and we generate the whole file before the response begins.

So in cases where we can process our file "line by line", we can save memory by writing a generator (or even better, an async generator) and giving it to StreamingResponse. CSV files are a good example.

The Django documentation has an interesting example of streaming big CSV files ("Streaming large CSV files"). I'm going to apply this idea to my FastAPI endpoint.
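A rough sketch of that idea, adapted to plain Python: a pseudo-buffer whose `write` returns the value instead of storing it, so each row can be yielded as soon as it is formatted. The names `Echo` and `stream_csv_rows` are illustrative assumptions.

```python
import csv
from typing import Iterable, Iterator, Sequence


class Echo:
    """File-like object that returns what is written instead of storing it."""

    def write(self, value: str) -> str:
        return value


def stream_csv_rows(rows: Iterable[Sequence[object]]) -> Iterator[str]:
    """Yield one formatted CSV line per input row.

    `stream_csv_rows` is a hypothetical helper name.
    """
    writer = csv.writer(Echo())
    for row in rows:
        # writerow() returns the result of the underlying write() call,
        # which here is the formatted line itself.
        yield writer.writerow(row)
```

Because `rows` can itself be a generator (for example, a database cursor), no more than one row needs to be held in memory at a time. The resulting iterator can be given to StreamingResponse with `media_type="text/csv"`.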

wwarne avatar Jun 25 '21 10:06 wwarne

> So in cases where we can process our file "line by line", we can save memory by writing a generator (or even better, an async generator) and giving it to StreamingResponse. CSV files are a good example.

100% agree. If you can truly stream the response, please do so. However, if your data source is, e.g., a pandas dataframe that you already have in memory and need to export to CSV, I found that it is faster to use PlainTextResponse.

scriptator avatar Jul 05 '21 15:07 scriptator