utoipa

The default / example schema for `Vec<u8>` is `"string"`

Open johnperry-math opened this issue 1 year ago • 10 comments

Version

latest

Description

If a field has type Vec<u8> then #[derive(ToSchema)] produces a description of "string" instead of an array of integers.

Expected behavior

The generated schema should be an array of integers.

MWE

Add the field icon: Vec<u8>, to struct Todo. Everything runs and compiles happily, but the generated schema is

icon*     string($binary)

and the example gives

    "icon": "string",

Workaround

Provide a default and example schema for icon as json!(vec![0_u8, 255]).

johnperry-math avatar Apr 10 '23 22:04 johnperry-math

This is actually by design because a list of u8s is the way raw bytes are represented, so utoipa assumes that the user wants to return binary data, e.g. octet-stream.

We could add an attribute that users can put on a field to have the type treated strictly as a Vec of numbers, e.g.

#[derive(ToSchema)]
struct Foo {
    #[schema(strict)]
    value: Vec<u8>,
}

juhaku avatar Apr 10 '23 23:04 juhaku

I don't know if we understand each other. I do in fact want binary data (vec of u8). The problem is that the schema represents it as a string.

If I ignore the first six words of your first sentence, it sounds like we have the same goal... but those first six words imply we don't. Can you elaborate?

johnperry-math avatar Apr 11 '23 14:04 johnperry-math

Oh sorry, to be clearer: the behavior of interpreting Vec<u8> (and slices as well) as a string is based on this: https://swagger.io/docs/specification/describing-request-body/file-upload/

juhaku avatar Apr 11 '23 15:04 juhaku

Adding a bit more detail: the following is the best OpenAPI has for a binary stream:

type: string
format: binary

It isn't valid JSON Schema as far as I know. This was added in https://github.com/juhaku/utoipa/issues/197

jayvdb avatar Apr 13 '23 08:04 jayvdb

From the two comments above, I gather that the current behavior is the expected behavior, so the workaround where I set an example works fine, at least for now. A strict option seems cleaner than the current workaround, especially if it would be useful in other circumstances.

My unfamiliarity with this may be getting in the way. To be clear: I need the schema for both a response and a request body in JSON format, and that's where I stumbled.

johnperry-math avatar Apr 13 '23 13:04 johnperry-math

@jayvdb Thanks for adding context

@johnperry-math Yes this is expected behavior.

Yeah, perhaps it is good to have such an attribute. Though I need to check whether you could already work around this issue with a value_type = ... attribute declaration.

#[derive(ToSchema)]
struct Foo {
    #[schema(value_type = Vec<u8>)] // <-- This might be able to make the schema behave 
    value: Vec<u8>,                 // as Vec of bytes (numbers)
}

juhaku avatar Apr 14 '23 13:04 juhaku

@juhaku Alas, value_type = Vec<u8> doesn't seem to work for me.

johnperry-math avatar Apr 14 '23 14:04 johnperry-math

I am also experiencing this issue.

julius-boettger avatar Nov 24 '23 16:11 julius-boettger

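// Imports assumed from the crates named in this comment (axum, axum_typed_multipart, utoipa).
use axum::body::Bytes;
use axum_typed_multipart::{FieldData, TryFromMultipart, TypedMultipart};
use utoipa::ToSchema;

#[derive(ToSchema, TryFromMultipart)]
pub struct MediaUpload {
    name: String,
    // Describe each uploaded file as raw bytes in the generated schema.
    #[schema(value_type = Vec<Vec<u8>>)]
    media: Vec<FieldData<Bytes>>,
}

#[utoipa::path(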
    post,
    path = "/media",
    request_body(content_type = "multipart/form-data", content = MediaUpload),
    responses(
        (status = 200, description = "media uploaded")
    )
)]
#[tracing::instrument(skip_all)]
pub async fn send_tele_media(TypedMultipart(data): TypedMultipart<MediaUpload>) {
    let media = &data.media.first().unwrap().metadata;
    println!("{media:#?}");
    let file_size = data.media.first().unwrap().contents.len();
    println!("size is {file_size}");
}

This works for me. The TryFromMultipart trait and the FieldData struct are from axum_typed_multipart. I like it better than the default axum multipart extractor.

I'm guessing you weren't able to upload binary because you didn't specify content_type = "multipart/form-data" under your request_body. I could be wrong about this.

leelhn2345 avatar Jun 14 '24 07:06 leelhn2345

This behavior is wrong when not using serde_bytes.

The struct as actually serialized contains a list of numbers, not a byte string, so the documentation is wrong. When using serde_bytes, the string representation is better.

The serde_bytes documentation even mentions that, because Rust lacks specialization, serde cannot treat [u8] any differently from other slices.

SZenglein avatar Jul 10 '24 15:07 SZenglein
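
To illustrate the last point about default serialization: serde treats Vec<u8> like any other sequence, so a JSON serializer such as serde_json writes the field as an array of numbers rather than as a string. The sketch below is a hand-rolled, std-only stand-in for that output, not serde or utoipa code. The helpers bytes_as_json_array and todo_body are hypothetical names introduced here; the icon field name is taken from the MWE at the top of the thread.

```rust
/// Hypothetical helper: renders bytes as a JSON array of numbers,
/// which is the shape a `Vec<u8>` takes under serde's default handling.
fn bytes_as_json_array(bytes: &[u8]) -> String {
    let items: Vec<String> = bytes.iter().map(|b| b.to_string()).collect();
    format!("[{}]", items.join(","))
}

/// Hypothetical helper: builds a minimal JSON body containing only the
/// `icon` field from the MWE, to show what a client would actually receive.
fn todo_body(icon: &[u8]) -> String {
    format!("{{\"icon\":{}}}", bytes_as_json_array(icon))
}
```

Calling todo_body(&[0, 255]) yields {"icon":[0,255]}, whereas the generated example in the schema shows "icon": "string". That mismatch is what the example workaround from the original report papers over.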