
Support async file types in `files = {}` and `content = ...`

Open lovelydinosaur opened this issue 4 years ago • 27 comments

We ought to support the following cases.

Raw upload content from an async file interface:

import httpx
import trio

async def main():
    async with httpx.AsyncClient() as client:
        async with await trio.open_file(...) as f:
            await client.post("https://www.example.com", content=f)

trio.run(main)

Multipart file upload from an async file interface:

import httpx
import trio

async def main():
    async with httpx.AsyncClient() as client:
        async with await trio.open_file(...) as f:
            await client.post("https://www.example.com", files={"upload": f})

trio.run(main)

We probably want to ensure that we're supporting both trio and anyio (which have the same interfaces), and perhaps also `aiofiles`. So e.g., also supporting the following...

# Supporting the same as above but using `asyncio`, with `anyio` for the file operations.
import anyio
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        async with await anyio.open_file(...) as f:
            await client.post("https://www.example.com", content=f)

asyncio.run(main())

The `content=...` case is a little simpler than the `data=...` case, since it really just needs an async variant of `peek_filelike_length`, and a minor update to the `._content.encode_content()` function.
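A minimal sketch of what that detection and adaptation could look like (the helper names `is_async_filelike` and `aiter_filelike` are hypothetical, not httpx API):

```python
import inspect

def is_async_filelike(obj) -> bool:
    # Heuristic: treat any object whose ``read`` attribute is a coroutine
    # function as an async file interface (covers trio/anyio/aiofiles wrappers).
    read = getattr(obj, "read", None)
    return read is not None and inspect.iscoroutinefunction(read)

async def aiter_filelike(f, chunk_size: int = 65_536):
    # Adapt an async ``read()`` interface into an async byte iterator,
    # which encode_content() could hand off to the async byte-stream machinery.
    while True:
        chunk = await f.read(chunk_size)
        if not chunk:
            break
        yield chunk
```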

Also fiddly is what the type annotations ought to look like.

lovelydinosaur avatar Apr 30 '21 09:04 lovelydinosaur

Hi, I'm interested in this issue and have also been looking at the `._content.encode_content()` function. The first thing to do is to work out the types used by trio and anyio. Could I have a try at getting it done? :D

Mayst0731 avatar Jun 03 '21 21:06 Mayst0731

Ohhh, as you said, it's a multipart issue; it seems aiohttp handles multipart as well
("https://docs.aiohttp.org/en/stable/multipart.html")

Mayst0731 avatar Jun 03 '21 22:06 Mayst0731

Hey @meist0731, are you still working on this issue? If yes, can we work on it together? Seems like an interesting problem.

ajayd-san avatar Jun 11 '21 06:06 ajayd-san

Hey @meist0731, are you still working on this issue? If yes, can we work on it together? Seems like an interesting problem.

Yep! I'm still working on this. It would be an honor to work together with you :DDD I'm going to sleep soon; tomorrow I'll share the materials I've found so far.

Mayst0731 avatar Jun 11 '21 06:06 Mayst0731

@meist0731 Cheers, are you on Discord? It'll be easier to work together.

ajayd-san avatar Jun 11 '21 08:06 ajayd-san

@meist0731 Cheers, are you on Discord? It'll be easier to work together. My ID: Krunchy_Almond#2794

Gotcha! I have Discord; wait a sec, bro.

Mayst0731 avatar Jun 11 '21 21:06 Mayst0731

@meist0731 Cheers, are you on Discord? It'll be easier to work together. My ID: Krunchy_Almond#2794

Hey, I've sent the invitation :D

Mayst0731 avatar Jun 11 '21 22:06 Mayst0731

@tomchristie How do you recommend we proceed with this issue? Can you explain where to start?

ajayd-san avatar Jun 21 '21 05:06 ajayd-san

I've tried these APIs as below,

import asyncio

import aiofiles
import anyio
import httpx
import trio

client = httpx.AsyncClient()

async def main1():
    async with await anyio.open_file('./content.txt', 'rb') as f:
        await client.post("https://www.example.com", content=f)
anyio.run(main1)

async def main2():
    async with await trio.Path('./content.txt').open('rb') as f:
        await client.post("https://www.example.com", content=f)
trio.run(main2)

async def main3():
    async with aiofiles.open('./content.txt', mode='rb') as f:
        await client.post("https://www.example.com", content=f)
asyncio.run(main3())

The multipart upload:

async def main5():
    async with httpx.AsyncClient() as client:
        async with await anyio.open_file('./content.txt','rb') as f:
            await client.post("https://www.example.com", files={"upload": f})
anyio.run(main5)

The problems here are:

(1) The functions above only work with files opened in "rb" mode rather than "r"; otherwise they raise a TypeError saying "sequence item 1: expected a bytes-like object, str found". I haven't yet figured out which part of the code handles this.

(2) Testing the functions above with a text file shows that, whether the file is read synchronously or asynchronously via trio, anyio, or aiofiles, the .peek_filelike_length function reports the file's length correctly. However, for multipart uploads the error says that "AsyncIOWrapper"/"AsyncFile"/"AsyncBufferedReader" (async-iterable objects) are not iterable. This appears to be because the iteration functions involved are sync functions: they receive async-iterable objects but cannot iterate them, with the exception of the final one, which is an async function.

For example, one of these functions accepts AsyncIterable objects but, not being an async function, cannot actually iterate them.
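The failure mode described here can be reproduced without httpx at all: a plain `for` loop cannot drive an async iterator, which is exactly what the sync rendering path attempts.

```python
import asyncio

async def fake_async_file():
    # Stands in for the chunks produced by an anyio/aiofiles handle.
    yield b"hello "
    yield b"world"

# Sync iteration (what a sync render loop does) fails:
try:
    b"".join(chunk for chunk in fake_async_file())
except TypeError as exc:
    print(exc)  # 'async_generator' object is not iterable

# Async iteration works fine:
async def main():
    return b"".join([chunk async for chunk in fake_async_file()])

print(asyncio.run(main()))  # b'hello world'
```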

Mayst0731 avatar Jun 24 '21 05:06 Mayst0731

The first problem has an existing discussion: https://github.com/encode/httpx/discussions/1704#discussion-3421862

Mayst0731 avatar Jun 24 '21 18:06 Mayst0731

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Feb 20 '22 15:02 stale[bot]

Argh, stale, wontfix, nooo 😱 !

Just want to make sure: uploading a file and data as multipart is still not supported by the async client, right? I'm getting the Attempted to send an sync request with an AsyncClient instance. error message when trying to do such a thing.

import asyncio
from uuid import uuid4

import httpx
from aiofiles import open as aopen

async def main():
    async with aopen("somefile.zip", "rb") as fp, httpx.AsyncClient() as client:
        files = {"content": ("somefile.zip", fp, "application/octet-stream")}
        response = await client.post(
            "http://localhost:8888",
            data=data_to_send,  # form fields, defined elsewhere
            files=files,
            follow_redirects=False,
            headers={"Content-Type": f"multipart/form-data; boundary={uuid4().hex}"},
        )

asyncio.run(main())

pawamoy avatar Jun 16 '22 12:06 pawamoy

@pawamoy No fix was implemented AFAIK. This seems like an issue that stalebot closed due to lack of activity, rather than us deciding it shouldn't be acted upon. I guess we can reopen (stalebot would come back in a few months), and any attempts towards supporting the interfaces described in the OP (trio, anyio, aiofiles) would be welcome!

florimondmanca avatar Jun 16 '22 19:06 florimondmanca

While the issue is unresolved, I'm using the following monkey-patch; maybe it will be helpful:

"""
This is workaround monkey-patch for https://github.com/encode/httpx/issues/1620

If you need to upload async stream as a multipart `files` argument, you need to apply this patch
and wrap stream with `AsyncStreamWrapper`::

    httpx_monkeypatch.apply()
    ...

    known_size = 42
    stream = await get_async_bytes_iterator_somehow_with_known_size(known_size)
    await client.post(
        'https://www.example.com',
        files={'upload': AsyncStreamWrapper(stream, known_size)},
    )
"""
import typing as t
from asyncio import StreamReader

from httpx import _content
from httpx._multipart import FileField
from httpx._multipart import MultipartStream
from httpx._types import RequestFiles


class AsyncStreamWrapper:
    def __init__(self, stream: t.Union[t.AsyncIterator[bytes], StreamReader], size: int):
        self.stream = stream
        self.size = size


class AsyncAwareMultipartStream(MultipartStream):

    def __init__(self, data: dict, files: RequestFiles, boundary: t.Optional[bytes] = None) -> None:
        super().__init__(data, files, boundary)
        for field in self.fields:
            if isinstance(field, FileField) and isinstance(field.file, AsyncStreamWrapper):
                field.get_length = lambda f=field: len(f.render_headers()) + f.file.size  # type: ignore # noqa: E501

    async def __aiter__(self) -> t.AsyncIterator[bytes]:
        for field in self.fields:
            yield b'--%s\r\n' % self.boundary
            if isinstance(field, FileField) and isinstance(field.file, AsyncStreamWrapper):
                yield field.render_headers()
                async for chunk in field.file.stream:
                    yield chunk
            else:
                for chunk in field.render():
                    yield chunk
            yield b'\r\n'
        yield b'--%s--\r\n' % self.boundary


def apply():
    _content.MultipartStream = AsyncAwareMultipartStream
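The core of the patch is the async `__aiter__`: an `async for` drives the wrapped stream while ordinary fields keep their sync render loop. A standalone, simplified sketch of that rendering logic (headers reduced to a single line; `render_multipart` is illustrative, not httpx's full wire format):

```python
import asyncio

async def render_multipart(boundary: bytes, parts):
    # parts: iterable of (field_name: bytes, chunks: async iterator of bytes).
    # Mirrors the shape of AsyncAwareMultipartStream.__aiter__ above,
    # minus httpx's FileField header rendering.
    for name, chunks in parts:
        yield b"--%s\r\n" % boundary
        yield b'Content-Disposition: form-data; name="%s"\r\n\r\n' % name
        async for chunk in chunks:
            yield chunk
        yield b"\r\n"
    yield b"--%s--\r\n" % boundary

async def demo():
    async def chunks():
        yield b"file "
        yield b"contents"

    rendered = render_multipart(b"BOUNDARY", [(b"upload", chunks())])
    return b"".join([c async for c in rendered])

body = asyncio.run(demo())
```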

reclosedev avatar Dec 12 '22 13:12 reclosedev

Has there been any progress on this by any chance?

and3rson avatar Apr 16 '23 18:04 and3rson

I'm also using files={"upload": f}, where f is a multipart async file upload from FastAPI.

It says TypeError: object of type 'coroutine' has no len(), which kills me. The file is quite large; I hope it gets handled in a streamable way.

lambdaq avatar May 19 '23 11:05 lambdaq

If anyone is invested in making this happen, I can make the time to guide a pull request through.

lovelydinosaur avatar May 19 '23 12:05 lovelydinosaur

I'm also using files={"upload": f}, where f is a multipart async file upload from FastAPI.

I solved this problem for FastAPI. When reading an uploaded file from a form, FastAPI wraps a SpooledTemporaryFile in an async-style interface.

The async interface doesn't work with httpx, but you can use the old-fashioned way: just change

httpx.post(..., files={"upload": f})

into

httpx.post(..., files={"upload": f.file})
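The reason this works: FastAPI's UploadFile stores its payload in a tempfile.SpooledTemporaryFile and merely wraps it with async read/write methods, so .file hands httpx a plain sync file object. A stdlib-only stand-in illustrating the shape (FakeUploadFile is hypothetical, for illustration only):

```python
import tempfile

class FakeUploadFile:
    # Hypothetical stand-in for fastapi.UploadFile: async read(),
    # with the underlying sync file exposed as .file.
    def __init__(self, data: bytes):
        self.file = tempfile.SpooledTemporaryFile()
        self.file.write(data)
        self.file.seek(0)

    async def read(self, size: int = -1) -> bytes:
        return self.file.read(size)

f = FakeUploadFile(b"hello")
# files={"upload": f} fails: httpx's sync machinery can't drive the coroutine read().
# files={"upload": f.file} works: it's an ordinary sync file object.
assert f.file.read() == b"hello"
```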

lambdaq avatar Jun 07 '23 02:06 lambdaq

A monkey patch showing a possible solution (it also supersedes #1706 and covers #2399):

https://gist.github.com/yayahuman/db06718ffdf8a9b66e133e29d7d7965f

And possible type annotations:

from abc import abstractmethod
from typing import AnyStr, AsyncIterable, Iterable, Protocol, Union  # 3.8+


class Reader(Protocol[AnyStr]):
    __slots__ = ()
    
    @abstractmethod
    def read(self, size: int = -1) -> AnyStr:
        raise NotImplementedError


class AsyncReader(Protocol[AnyStr]):
    __slots__ = ()
    
    @abstractmethod
    async def read(self, size: int = -1) -> AnyStr:
        raise NotImplementedError


FileContent = Union[
    str,
    bytes,
    Iterable[str],
    Iterable[bytes],
    AsyncIterable[str],
    AsyncIterable[bytes],
    Reader[str],
    Reader[bytes],
    AsyncReader[str],
    AsyncReader[bytes],
]

RequestContent = FileContent

yayahuman avatar Jul 06 '23 00:07 yayahuman

@tomchristie, can my monkey patch approach be acceptable?

yayahuman avatar Jul 10 '23 22:07 yayahuman

Let me help guide this conversation a bit more clearly. I would suggest starting by looking at just the `content=...` case. A good starting point for a pull request would be a test case demonstrating the behaviour we'd like to see there.

lovelydinosaur avatar Jul 11 '23 09:07 lovelydinosaur