stable-diffusion-webui PoC API

PoC API

Open TomJamesPearce opened this issue 1 year ago • 49 comments

This is a PoC API file to serve as a template for other contributions and to get a discussion started. It uses FastAPI and only implements txt2img.

There are a number of things that need to be addressed for a merge.

[ ] Implement other endpoints/scripts
[ ] Requests should be queued. At the moment they are called async, which is likely to cause OOM errors.
[ ] At the moment, the last few params of txt2img are hard coded, because I haven't figured out what they are yet.
[x] Merge requirements somehow.
[x] Proper .bat and .sh runfiles. At the moment, I run webui.bat|sh to build the base venv, then install the extra requirements into the venv and run api.py with that venv.
[x] Add params around port/host publishing. At the moment it listens on all interfaces by default, which is not the safest default.
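
On the queueing item in the list above: one simple way to stop concurrent generations from exhausting memory is to funnel every request through a single lock, so only one job runs at a time. This is a minimal sketch, not what the branch does. `generate` is a hypothetical stand-in for the real txt2img call, and the counters exist only to demonstrate that jobs never overlap.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real generation call; not taken from the branch.
def generate(prompt):
    return f"image for {prompt}"

_gpu_lock = threading.Lock()
_active = 0      # jobs currently inside the critical section
max_active = 0   # highest concurrency observed, for demonstration only

def queued_generate(prompt):
    """Run generate() while holding the lock, so jobs execute one at a time."""
    global _active, max_active
    with _gpu_lock:
        _active += 1
        max_active = max(max_active, _active)
        try:
            return generate(prompt)
        finally:
            _active -= 1

# Simulate several clients hitting the endpoint at once.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(queued_generate, ["a", "b", "c", "d"]))
```

Waiting callers simply block on the lock, which behaves like an unbounded queue. A real implementation would probably want a bounded queue and a timeout.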

Once launched, http://localhost:8080/docs shows example usage. This can be extended further with more doco.

Here's an example script to hit the API once it's running.

import io
import requests
from PIL import Image
import base64

params = {
    "txt2img": {
        "prompt": "A happy doggo",
        "steps": 15,
        "batch_size": 1,
    }
}

resp = requests.get(url="http://localhost:8080/txt2img/", json=params).json()

for i in resp['images']:
    img = Image.open(io.BytesIO(base64.b64decode(i)))
    img.show()

Sep 20 '22 18:09 TomJamesPearce

Well, this is pretty raw, just as you wrote. If you're going to continue to work on this, I'd like to not have another file with copied requirements - only list those needed for the API (and preferably without requirements of requirements). The functions you copied from webui should not be copies and instead should be imported. Considering how tightly gradio is integrated, you're not getting rid of gradio, and if you are not, we can just launch the API from webui.py depending on commandline flags. The scary TextToImageResponse constructor near the end should hopefully be called using kwargs without enumerating all fields.
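
To illustrate the kwargs suggestion for the TextToImageResponse constructor: a minimal sketch using a stdlib dataclass rather than the class in the branch. The fields shown are a hypothetical subset, and `build_response` is an illustrative helper name.

```python
from dataclasses import dataclass, fields

@dataclass
class TextToImageResponse:
    # Hypothetical subset of fields; the real class has many more.
    images: list
    seed: int = -1
    steps: int = 20

def build_response(**values):
    """Build the response from keyword arguments, dropping unknown keys."""
    known = {f.name for f in fields(TextToImageResponse)}
    return TextToImageResponse(**{k: v for k, v in values.items() if k in known})

# Only the fields that differ from the defaults need to be named.
resp = build_response(images=["<base64>"], seed=42, unrelated="ignored")
```

The call site no longer has to enumerate every field in order, so adding a field with a default does not break existing callers.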

Sep 20 '22 19:09 AUTOMATIC1111

Happy to contribute yes.

we can just launch api from webui.py depending on commandline flags

To clarify, you see this working something like:

if __name__ == "__main__":
    if not cmd_opts.api:
        webui()
    else:
        api()

Sep 20 '22 21:09 TomJamesPearce

Pushed an update: the example API is now integrated into webui.py, and I added a commandline flag to switch between the webui and the API.

Added the requirements to the default requirements files.

Now defaults to localhost, also allows switching of port.

Given how bulky the definitions for the API params/return classes are, it might be worth declaring them somewhere else, and importing them.

Sep 21 '22 00:09 TomJamesPearce

Just some words of encouragement - this is a great idea and looking forward to seeing this in main

Sep 21 '22 02:09 Oceanswave

The bulk of API should be in its own file, something like modules/api.py, with just one function called from webui.py

Sep 21 '22 06:09 AUTOMATIC1111

Woot! This would be great. I'm currently using an ugly, brittle, hacky way to use the current system as an API (by faking gradio-type requests) for my custom native client but this would be so much better.

Sep 21 '22 06:09 SethRobinson

I'd love to work on a custom UI if we had an API, so I'm fully in support of this. Could we also serve up the resulting images as URLs and use FastAPI's static file serving to facilitate that? I think I'm noticing performance issues in the UI due to the massive data-uris.
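
A rough sketch of the URL idea, assuming the server writes each generated image to a directory and returns paths instead of base64 payloads. `save_and_link`, the `/outputs` prefix, and the file naming are all hypothetical. Serving the directory would still need something like FastAPI's StaticFiles mount, which is not shown here.

```python
import base64
import tempfile
import uuid
from pathlib import Path

def save_and_link(b64_images, out_dir, url_prefix="/outputs"):
    """Write each base64-encoded image to out_dir and return URL paths for them."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    urls = []
    for b64 in b64_images:
        name = f"{uuid.uuid4().hex}.png"
        (out_dir / name).write_bytes(base64.b64decode(b64))
        urls.append(f"{url_prefix}/{name}")
    return urls

# Demonstration with dummy bytes rather than a real PNG.
demo_dir = tempfile.mkdtemp()
demo_urls = save_and_link([base64.b64encode(b"fake image bytes").decode()], demo_dir)
```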

Sep 21 '22 10:09 overra

Would also be nice to have API versioning. (Like the client could send the requested version of the api in requests (ie, first version is 1 or whatever), to gracefully add changes/additions later without breaking old clients)
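
One common way to do this is to put the version in the URL path and register a handler per version. The sketch below is framework-free and purely illustrative: the handler names, payload shapes, and the `ROUTES` table are assumptions, not part of the PoC.

```python
# Hypothetical handlers; names and payload shapes are illustrative only.
def txt2img_v1(payload):
    return {"images": [], "prompt": payload["prompt"]}

def txt2img_v2(payload):
    # A later version can add fields without changing what v1 clients receive.
    return {
        "images": [],
        "prompt": payload["prompt"],
        "info": {"steps": payload.get("steps", 20)},
    }

ROUTES = {
    ("v1", "txt2img"): txt2img_v1,
    ("v2", "txt2img"): txt2img_v2,
}

def dispatch(path, payload):
    """Route '/<version>/<endpoint>' to the handler registered for that version."""
    version, endpoint = path.strip("/").split("/", 1)
    handler = ROUTES.get((version, endpoint))
    if handler is None:
        raise KeyError(f"no handler for {path}")
    return handler(payload)
```

Old clients keep calling `/v1/...` and see the old response shape, while new clients opt in to `/v2/...`.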

Sep 21 '22 13:09 SethRobinson

I love this idea. My only concern is the difficulty involved with keeping the API up to date when the webui and tooling in general are continuing to evolve so rapidly. I'm probably just getting ahead of myself though... A basic API that doesn't cover everything is definitely better than no API.

Sep 21 '22 16:09 JustMaier

Thanks for the kind encouragement and input all.

I've added api.py to /modules/ and stripped pretty much everything out of webui.py, except for a quick import, init and run.

I still need to implement queueing of requests, the other endpoints, and figure out what the other params are at the end of my call to txt2img (if anyone could fill me in so I can just add them to the API definitions I'd appreciate it).

Sep 21 '22 17:09 TomJamesPearce

See my previous request: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/65 You can just make a fork with your own api.py, as I did. But it would be very helpful not to be forced to index into the parameter list of the main functions img2img and img2txt (CSV again). Because of the *args at the end, named arguments are not possible. Better to use kwargs. Thank you!
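
As an illustration of the workaround an API layer needs while the main functions take positional arguments: a small adapter can map named parameters onto the expected positional order. The order and defaults in `PARAM_ORDER` are hypothetical; the real txt2img signature is longer and changes between commits.

```python
# Hypothetical parameter order and defaults; not the real txt2img signature.
PARAM_ORDER = [
    ("prompt", ""),
    ("steps", 20),
    ("batch_size", 1),
    ("width", 512),
    ("height", 512),
]

def to_positional(**named):
    """Turn named parameters into the positional tuple a *args-style function expects."""
    unknown = set(named) - {name for name, _ in PARAM_ORDER}
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    return tuple(named.get(name, default) for name, default in PARAM_ORDER)

args = to_positional(prompt="A happy doggo", steps=15)
```

The table still has to be kept in sync with the underlying function by hand, which is exactly the maintenance burden described above.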

Sep 22 '22 11:09 imperator-maximus

Come on now, imperator maximus, you should be able to solve this. args is there for a reason; if I could use kwargs, I would.

Sep 22 '22 13:09 AUTOMATIC1111

Every time I get an update it is a lot of fun to manage it - yes :) But it would be better to get rid of this Gradio stuff in the future - it blocks development here. I can really understand that not having to manage UI work saves a lot of time, but in the long term it is too limited anyway. Or the Gradio team will improve this - that would also be a solution.

Sep 22 '22 13:09 imperator-maximus

Gradio is not perfect but it's not bad enough to switch away from it.

Sep 22 '22 13:09 AUTOMATIC1111

imperator, I agree that the function signature isn't ideal, but people are happily using Gradio.

If Gradio makes this easier on their side, or someone implements a UI that's competitive with Gradio (Which an API will let people experiment with), then I'd advocate for switching, until then we make do.

I'm hoping that I'll have some time this weekend to push out the last few things.

@AUTOMATIC1111 if I fix the queueing and implement the other endpoints is there anything else you'd like to see before a merge?

Sep 22 '22 17:09 TomJamesPearce

keep webui() function as it is, add a single api() (or name it however you want) to modules.api and do

if __name__ == "__main__":
    if cmd_opts.api:
         modules.api.api()
    else:
         webui()

I don't like big class definitions but looks like there's not much that can be done. Maybe I'll experiment with eliminating them later.

0, False, None, '', False, 1, '', 4, '', True) should be 0).

Other than that, all's good.

Sep 22 '22 20:09 AUTOMATIC1111

finally people of color are going to get some recognition

Sep 22 '22 20:09 AUTOMATIC1111

Sorry, but I can't make it work with your sample code or any other request library.

this is request file.

And when the request is made, the server returns a 422 Unprocessable Entity error.

Sep 25 '22 07:09 ahgsql

Fixed it by adding methods=["POST"] in api.py #71

Sep 25 '22 11:09 ahgsql

@ahgsql I got it working using the GET call, but you have to pass the following as the body:

{
    "txt2imgreq":
    {
        "prompt": "A happy dog",
        "steps": 15,
        "batch_size": 1
    }
}

@TomJamesPearce Am I missing something integrating this API? I can use the standard Gradio generators with no issues, but using this API any hits to txt2img result in:

 File "D:\Dev\stable-diffusion-webui\modules\processing.py", line 437, in init
    scale = math.sqrt(desired_pixel_count / actual_pixel_count)
ZeroDivisionError: division by zero

EDIT: Stepping into the code, it looks like Width is assigned to false somehow? Manually forcing it looks like it starts to generate, but then explodes with noise = noise * sigmas[steps - t_enc - 1] IndexError: index -20 is out of bounds for dimension 0 with size 1.

Sep 25 '22 19:09 scpedicini

Yeah, now it works with a GET request. Any advice for getting the noisy intermediate images while it's generating? I mean, can we get the image every x steps of the generation phase?

Sep 25 '22 19:09 ahgsql

Baby steps. Let the guy finish what he planned then we will think about expanding the API.

Sep 25 '22 21:09 AUTOMATIC1111

Sounds good, just a quick general update. I got those errors when I tried to use @TomJamesPearce's code with the latest commit in AUTOMATIC1111 - it works fine "out of the box" on the forked branch, however.

Sep 25 '22 23:09 scpedicini

It's because of the parameters. First, in api.py:

insert seed_enable_extras: bool = Field(default=False, title="seed_enable_extras") after seed_resize_from_w: int = Field(default=0, title="Seed Resize From Width")

Second, in images.py line 298, convert job_timestamp to str:

x = x.replace("[job_timestamp]", str(shared.state.job_timestamp))
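
For context on the second fix: `str.replace` only accepts strings, so passing a non-string replacement raises a TypeError. A small self-contained illustration, with a hypothetical filename pattern and timestamp value:

```python
pattern = "out-[job_timestamp].png"
job_timestamp = 20220926080000  # hypothetical non-string value

try:
    pattern.replace("[job_timestamp]", job_timestamp)
    failed = False
except TypeError:
    failed = True  # str.replace rejects a non-string replacement

# Wrapping the value in str() makes the substitution safe for any type.
fixed = pattern.replace("[job_timestamp]", str(job_timestamp))
```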

Sep 26 '22 08:09 rocing

Yeah, sorry things broke. The API had some changes after I wrote the example script. If you run my branch and navigate to http://host:port/docs in the browser you'll see the current implementation of the API for any given commit.

Hopefully I'll have time in the next couple days to finish this off.

There may be other bugs with it, I've only really used it with a given set of parameters tuned towards my use case (~50k images so far...). Happy to accept merges if other people want to contribute on my working branch.

Sep 26 '22 16:09 TomJamesPearce

Just following this - would really come in handy especially if the queue is implemented. Especially if this repo gets multi gpu VRAM pooling or even just multi GPU workload splitting. Great stuff!

Oct 03 '22 20:10 huotarih

I just get a generic set of API calls if I visit http://host:port/docs - am I missing something obvious? There are API docs - but nothing specific to this project - just user/token etc. I tried the sample calls mentioned above and just got a detail: Not Found JSON response. My latest git pull was last night.

Do I need to enable the API or something? I'm looking to call it from another machine via the command line.

[Edit - okay perhaps this is only available on the https://github.com/TomJamesPearce/stable-diffusion-webui-api fork]

Oct 05 '22 14:10 olinorwell

I've added features to api.py on my fork:

Fixed compatibility with Python versions before 3.9; TomJamesPearce's initial api.py gave me "TypeError: 'type' object is not subscriptable" errors
Added img2img/inpainting, extras (upscaling stuff), and interrogator endpoints
--api command now runs in tandem with the web interface via mount_gradio_app, so both interfaces can be used at once (note that when --api is used the gradio init is different and is missing the auth stuff, and that gradio 3.4 is required for this)
Uses POST instead of GET
Can specify sampler_name instead of a sampler_index number, likewise with things like inpainting_fill
Added a Jupyter notebook file with API access tests

It now has everything I need for my frontend to use it instead of the horrible cursed gradio hack I was doing previously, a bit sloppy but maybe someone will find something useful in it
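
On the Python 3.9 compatibility point above: built-in generics such as `list[str]` in annotations only became subscriptable in Python 3.9, so on older interpreters they raise "TypeError: 'type' object is not subscriptable" when the definition is evaluated. Using the `typing` aliases works on both. A minimal sketch with a hypothetical helper function:

```python
from typing import Dict, List

# Writing `dict[str, list[str]]` here would fail on Python 3.8 and earlier;
# the typing aliases are accepted by old and new interpreters alike.
def first_images(batches: Dict[str, List[str]]) -> List[str]:
    """Return the first image of every non-empty batch."""
    return [images[0] for images in batches.values() if images]
```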

Oct 06 '22 06:10 SethRobinson

getting this error in your fork

Oct 06 '22 20:10 ahgsql

Hmm. This is with the test in the notebook? When it has an error like this, it usually sends back something in the client response that might be a clue.

Try this from a shell on the same machine the server is running:

curl -X 'POST' 'http://localhost:7860/v1/txt2img' -H 'Content-Type: application/json' -d '{"txt2imgreq": {"prompt": "A happy doggo"} }'

This should either work (a HUGE json file returned) or the output returned will show the specific error.

Note that the URL cannot have a trailing slash (I don't think yours does, but figured I'd mention it anyway as some earlier examples in this thread did)
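
For anyone calling the fork from Python instead of a shell, a rough stdlib equivalent of the curl command is sketched below. `build_txt2img_request` is a hypothetical helper. It only builds the request, so nothing is sent until `urlopen` is called.

```python
import json
import urllib.request

def build_txt2img_request(base_url, prompt):
    """Build (but do not send) a POST request mirroring the curl command."""
    url = base_url.rstrip("/") + "/v1/txt2img"  # no trailing slash on the endpoint
    body = json.dumps({"txt2imgreq": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_txt2img_request("http://localhost:7860/", "A happy doggo")
# To actually send it: urllib.request.urlopen(req).read()
```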

Oct 06 '22 21:10 SethRobinson
