UGATIT
Pretrained model?
Do you have any pretrained model weights? I currently can't train something like this so was curious if you had anything pretrained available.
I'm talking to the company about whether it's okay to release a pre-trained model. Please wait a little. Sorry.
I trained the model on my own dataset, and the results don't look very good. Hopefully you can share your pre-trained model @taki0112
We want to make anime, please
If you open a patreon or something, we can subscribe for your pre-trained model 🗡
Alternatively, it would be amazing if you could share the selfie2anime dataset.
See issue #6
Can't wait for a pre-trained model~ please~
Let's hope the company will allow you to release the model.
It would be really helpful if you could release an existing model for our reference. Please~
Guys just chill for a moment. Taki already said that they're talking to the company about it. It's been 2 days. Calm down and wait. They know that we want this, flooding won't help.
In the meantime, why don't you try something yourself? You can use Microsoft's Azure or Amazon's AWS to train this type of network. Maybe you can even come up with something better! Who knows, right?
The thing we need to understand is that no one likes begging and pleading. These people have worked hard on something, and it's completely up to them if they choose to release their models or datasets. I appreciate the fact that they open-sourced their code. Personally, I wouldn't mind even paying for their models and dataset. In the meantime let's stop flooding this thread and wait for @taki0112 's response.
If you don't want to share, I can understand. But you could create a website that offers to convert photos into anime. The website would become very popular, and you could earn some money from the advertising.
I have published a pre-trained model for cat2dog on Kaggle. Please let me know if you have any issues with it. I saved the results in this PDF so you can see what they look like: results.pdf. I used the cat2dog dataset from DRIT.
It takes 4+ days to train on the cropped face dataset and 16+ days on the cropped body dataset on Nvidia GPUs (estimates). Since a single training run takes many days, and it takes many iterations of training, it will take some time, but eventually many people will publish and share their pre-trained models in the weeks to come. Datasets can be found at DRIT. For selfie2anime you can use the selfie dataset and an anime face dataset. Other potential anime face dataset sources: thiswaifudoesnotexist, animeGAN for generating anime images, and a one-click-download anime face dataset. UGATIT is quite general; you really just need a folder of anime faces and a folder of human faces, and it figures out the rest by itself.
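If it helps anyone getting started, here is a minimal sketch for arranging two flat folders of images into the `trainA`/`trainB`/`testA`/`testB` layout that CycleGAN-style repos (including this one, as far as I can tell) expect. The folder names, paths, and test split fraction are my assumptions; check the repo's README before relying on it.

```python
# Hypothetical helper: arrange two flat folders of images into the
# trainA/trainB/testA/testB layout used by CycleGAN-style repos.
# Folder names follow the common convention -- verify against the README.
import random
import shutil
from pathlib import Path

def build_dataset(selfies_dir, anime_dir, out_dir="dataset/selfie2anime", test_frac=0.05):
    for side, src in [("A", Path(selfies_dir)), ("B", Path(anime_dir))]:
        images = sorted(p for p in src.iterdir()
                        if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
        random.shuffle(images)
        n_test = max(1, int(len(images) * test_frac))
        for split, files in {"test": images[:n_test], "train": images[n_test:]}.items():
            dst = Path(out_dir) / f"{split}{side}"
            dst.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy(f, dst / f.name)

build_dataset("raw/selfies", "raw/anime_faces")
```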
@thewaifuai Hello! Would you like to publish the pre-trained model of selfie2anime in future? Thanks.
Yes
FYI, I'm using a quickly-assembled, crappy dataset and a relatively slow cloud GPU machine. Also, I reduced the resolution to 100x100 pixels (256 just takes too long for me). The results look like this after one day of training:
(result images not preserved)
Not too bad, but still a lot of room for improvement :)
What I can recommend if you'd like to create a better one:
- Make sure the two datasets have similar poses / distances to the face. You can tell in mine that the anime data is much more close-up to the face, so the model learned that part of the transformation is "zooming in" (see the cropping sketch after this list).
- Make sure the anime dataset is diverse. Right now, in my model, everything from black men to old women gets transformed into 12-yo-looking girls with giant eyes, white skin, and bangs. I'd really rather it learns something more diverse...
- Get a serious cloud machine and expect to spend some time. The batch size of 1 is killing me 😅
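On the first point, one rough way to equalize framing is to crop a fixed margin around the detected face in every image. Here's a sketch using OpenCV's bundled Haar cascade; the 1.6x margin and folder names are arbitrary choices to tune by eye. Note that Haar cascades largely fail on anime faces, where an anime-specific detector such as lbpcascade_animeface is the usual workaround.

```python
# Rough sketch: crop every image to a box centered on the detected face,
# so both domains end up with a similar "zoom level". The 1.6x margin is
# an arbitrary choice -- tune it until the two datasets match visually.
import cv2
from pathlib import Path

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(path, out_path, margin=1.6):
    img = cv2.imread(str(path))
    if img is None:
        return False
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return False  # skip images with no detectable face
    # take the largest detection and crop a square around its center
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    cx, cy, half = x + w // 2, y + h // 2, int(max(w, h) * margin / 2)
    x0, y0 = max(cx - half, 0), max(cy - half, 0)
    x1, y1 = min(cx + half, img.shape[1]), min(cy + half, img.shape[0])
    cv2.imwrite(str(out_path), img[y0:y1, x0:x1])
    return True

Path("cropped").mkdir(parents=True, exist_ok=True)
for p in Path("raw/selfies").glob("*.jpg"):
    crop_face(p, Path("cropped") / p.name)
```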
Can you share your training dataset?
Or Pretrained model?
Thanks very much!~
This is my email: [email protected]
@thewaifuai I'm not sure why but your cat2dog kaggle link doesn't work?
Oops, Kaggle datasets are private by default; I had to manually make it public. It is now public and should work.
I am actively working on writing a TPU version of UGATIT. If anyone is interested, please respond to my UGATIT TPU issue; I am interested in working with others to make the TPU version.
Should the images in trainA and trainB be the same size? The selfies are 306x306, but my anime faces were 512x512, a mix of PNGs and JPGs. I did run into some errors.
This is on a 4x P100 setup with 11 GB VRAM each. trainA is the selfie dataset and trainB is http://www.seeprettyface.com/mydataset_page2.html plus a 1k dump of male anime faces from gwern's TWDNEv2 website.
I guess if I reduce the batch size, I can train quickly and release the pre-trained models.
@tafseerahmed the size and format of the images shouldn't matter; they get resized anyway, AFAIK.
The error you're getting is OOM (out of memory). I believe you don't have enough available RAM (as opposed to GPU memory) to create the model. Is that possible?
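For what it's worth, a quick way to check whether host RAM (rather than GPU memory) is the bottleneck while the model is being built; psutil is an assumed dependency here:

```python
# Print host RAM usage while the model builds, to distinguish a
# system-memory OOM from a GPU-memory one. Requires: pip install psutil
import psutil

mem = psutil.virtual_memory()
print(f"total: {mem.total / 1e9:.1f} GB, "
      f"available: {mem.available / 1e9:.1f} GB, "
      f"used: {mem.percent:.0f}%")
```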
@tafseerahmed use the `--light True` option; if that does not work, run `pkill python3` and then try again with `--light True`. This runs the light version of UGATIT.
Someone is using 2 of the GPUs right now, but I still have over 256 GB of RAM available.
Wouldn't that reduce the quality of the final results?
Yes
lol thanks, it's training now
But did you train yours on the heavy model instead of the light one? I imagine the full model requires more than 16 GB of VRAM.
The light version significantly reduces the capacity of the model. I haven't trained for long but I don't think it's worth trying.
With that hardware, you really shouldn't have any memory issues. Maybe the dataset is too big and already takes up most of the memory? I don't know, but I think you should investigate / experiment more.
The batch size was set to 1 by default (which is inefficient when you have a GPU), so I can't imagine the hardware was the issue. I will debug more and let you guys know; in the meantime, I am training the light model.
Yeah, batch size of 1 is necessary for cycle GANs.
Another thing I've learned: You can increase the speed of your training quite significantly by already providing the right image size. Because otherwise the training procedure will take loads of time just resizing images. Here's what it says in the paper:
> All models are trained using Adam [19] with β1 = 0.5 and β2 = 0.999. For data augmentation, we flipped the images horizontally with a probability of 0.5, resized them to 286 × 286, and random-cropped them to 256 × 256.
I would first use ImageMagick to batch-resize all your data to 286x286 or similar. I think that could save you a day or so of training time.
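If you'd rather do it in Python than ImageMagick, here's a Pillow equivalent; the paths are placeholders, and note it squashes the aspect ratio to exactly 286x286, matching the paper's resize step:

```python
# Batch-resize training images to 286x286 ahead of time so the training
# loop doesn't pay the resize cost on every pass over the data.
from pathlib import Path
from PIL import Image

src, dst = Path("dataset/selfie2anime/trainA"), Path("resized/trainA")
dst.mkdir(parents=True, exist_ok=True)

for p in src.glob("*"):
    if p.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
        continue
    with Image.open(p) as img:
        img.convert("RGB").resize((286, 286), Image.LANCZOS).save(dst / p.name)
```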