Kandinsky-2

What are the benefits?

Open WASasquatch opened this issue 2 years ago • 2 comments

I'm curious: is it really appropriate to use AI on multiple languages in the same interpreter, with discrimination between them? What is the benefit here? Is there any loss in language understanding from having them all together, compared with a larger version focused on a single language?

When I look at the results from these models, like the example here and the model architecture graph, why are they all influenced by the culture of a detected image? For example, here the teddy bear is on a skateboard next to iconic Russian landmarks that you'd get from simply prompting "Russia". To me this seems like severe contamination from discrimination in detection.

We use Google for translation on a service I work for, because none of the multilingual CLIP models seem to actually work right. By simply translating the prompt before it hits any AI that could create anomalies, you get more cohesive results that are "true", so to speak.
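As a rough sketch of the translate-first idea described above: the prompt is converted to English before any image model sees it, so the model never gets a chance to react to the source language. Everything named here is hypothetical and for illustration only. `toy_translate` is a dictionary lookup standing in for a real translation service, and `fake_generate` stands in for whatever text-to-image model is used; neither reflects the API of idun.ai, Google, or Kandinsky.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a real translation service (assumption:
# a real pipeline would call an external translator here instead).
_TOY_DICTIONARY: Dict[str, str] = {
    "un ours en peluche sur une planche un skateboard":
        "A teddy bear on a skateboard",
}


def toy_translate(prompt: str) -> str:
    """Return the English version of a known prompt, else the prompt unchanged."""
    return _TOY_DICTIONARY.get(prompt.strip().lower(), prompt)


def fake_generate(prompt: str) -> str:
    """Hypothetical placeholder for an English-only text-to-image model."""
    return f"<image for: {prompt}>"


def generate_with_pretranslation(
    prompt: str,
    translate: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Translate the prompt first, then pass only the English text to the model."""
    english_prompt = translate(prompt)
    return generate(english_prompt)


if __name__ == "__main__":
    result = generate_with_pretranslation(
        "Un ours en peluche sur une planche un skateboard",
        toy_translate,
        fake_generate,
    )
    print(result)
```

Passing the translator and generator in as plain callables keeps the two stages separate, so either one can be swapped out without touching the other.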

These were made on https://idun.ai, which uses Google Services to translate from over 100 languages (free to use on anything running Python from pip).

Un ours en peluche sur une planche un skateboard image

A teddy bear on a skateboard image

I would consider these results "accurate" because they convey what you want, in a neutral environment that is not contaminated by a detected nationality's most stereotypical landmarks. The image correctly represents the dataset for skating (the neutral environments where skaters are captured for datasets) and then satisfies "teddy bear".

WASasquatch, Mar 26 '23 20:03

Where do you check? The results are amazing compared to SD 2.1.

27.) Python Script - Jupyter Based - PC - Free Midjourney Level NEW Open Source Kandinsky 2.1 Beats Stable Diffusion - Installation And Usage Guide image

FurkanGozukara, Apr 05 '23 22:04

This doesn't appear to be related to what I'm talking about. I'm talking about things like using French for the prompt above and consistently getting something like the Eiffel Tower in the result, versus translating before the AI and getting what you prompted: just a bear on a skateboard.

WASasquatch, Apr 05 '23 22:04