Let model authors specify filetypes for inputs and outputs (audio, video, image, etc.)

Open zeke opened this issue 2 years ago • 2 comments

The cog.Path object is used to get files in and out of models. It represents a path to a file on disk. Path is used for all files, regardless of whether they're text files, zip files, videos, images, audio files, etc.

What kind of file does the model want? 🤷🏼

When looking at the schema for a model, it's not easy to tell what type of file is expected:

$ curl -s -H "Authorization: Token $REPLICATE_API_TOKEN" \
  https://api.replicate.com/v1/models/stability-ai/sdxl | jq ".latest_version.openapi_schema.components.schemas.Input.properties.mask"

SDXL's mask input expects an image file, but that's not clear from the schema. Unless the model author writes a description that says what kind of file is expected, users of the model can't reliably know what's expected:

{
  "type": "string",
  "title": "Mask",
  "format": "uri",
  "x-order": 3,
  "description": "Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted."
}

Being explicit about file types

What if, instead of defining the mask in the predictor as a Path, it could be an ImagePath, which would really just be a Path under the hood with some extra constraints?

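# ImagePath is the type proposed in this issue; cog does not export it today.
from cog import BasePredictor, Input, Path, ImagePath

class Predictor(BasePredictor):
    def predict(
        self,
        mask: ImagePath = Input(
            description="Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.",
            default=None,
        ),
    ) -> Path:  # return type shown for illustration only
        ...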
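To make "a Path under the hood with some extra constraints" concrete, here is a minimal sketch of what such a type might look like. It is not cog's implementation: it subclasses the standard library's `PurePosixPath` rather than `cog.Path` so that it runs on its own, and the `validate` method and `accepted_prefix` attribute are hypothetical names. It guesses the MIME type from the file extension only, which is an assumption; a real check might inspect the file contents.

```python
import mimetypes
from pathlib import PurePosixPath


class ImagePath(PurePosixPath):
    """Hypothetical path type that only accepts files that look like images."""

    # Assumed constraint: the guessed MIME type must start with this prefix.
    accepted_prefix = "image/"

    def validate(self) -> "ImagePath":
        # Guess the MIME type from the file name's extension (contents are not read).
        guessed, _ = mimetypes.guess_type(self.name)
        if guessed is None or not guessed.startswith(self.accepted_prefix):
            raise ValueError(
                f"{self.name!r} does not look like an image (guessed type: {guessed})"
            )
        return self


# Usage: a .png passes, a .txt is rejected.
mask = ImagePath("inputs/mask.png").validate()
try:
    ImagePath("inputs/notes.txt").validate()
except ValueError as err:
    print(err)
```

Because the constraint lives on the type, the same information could in principle be surfaced in the model's schema, which is the part consumers actually see.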

This may be a naive suggestion about how to approach making input and output types more apparent to model consumers, but I'm open to other ideas that address the issue.

Related issues:

  • https://github.com/replicate/cog/issues/496

zeke, Oct 18 '23 22:10

Maybe it could be a property of the existing Path, like a list of mimetypes or something.
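A rough sketch of how a list of mimetypes could be checked against a file, assuming glob-style patterns such as `image/*` are allowed. The helper name `matches_mimetypes` is hypothetical and nothing here is part of cog; it only uses the standard library and guesses the type from the file extension.

```python
import mimetypes
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Sequence


def matches_mimetypes(path: str, allowed: Sequence[str]) -> bool:
    """Return True if the file's guessed MIME type matches any allowed pattern.

    Hypothetical helper: `allowed` holds exact types ("audio/mpeg") or
    glob-style patterns ("image/*").
    """
    guessed, _ = mimetypes.guess_type(PurePosixPath(path).name)
    if guessed is None:
        # Unknown extension: treat as not matching rather than guessing.
        return False
    return any(fnmatch(guessed, pattern) for pattern in allowed)


# Usage: the mask input might declare ["image/*"] as its accepted types.
print(matches_mimetypes("mask.png", ["image/*"]))
print(matches_mimetypes("clip.mp3", ["image/*"]))
```

Keeping the list on the existing Path input, rather than adding one subclass per media type, would let an author accept several kinds of file for a single input.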

zeke, Nov 16 '23 17:11

Related: https://github.com/replicate/cog/pull/2014

zeke, Oct 23 '24 15:10