BiRefNet
The result from CUDA is different from that from CPU when testing BiRefNet_HR
Hello, I ran into something curious while testing your new BiRefNet_HR. Using an image of 5 people standing against a relatively dim background, I ran the example code you posted on Hugging Face on a Google Colab T4. The resulting mask shows only 4 people; the right-most person is missing. I then slightly modified the sample code to run the same image on CPU only (32-bit float), without CUDA, and got a different result in which all 5 people appear. The outcome is the same with either hr or hr_matting. Is this expected or intended? I am attaching the mask images from the GPU and the CPU, in that order.
Wow, that's surprising. In theory, FP32 vs. FP16 should differ by approximately zero, and CUDA vs. CPU should not differ at all. I have also tested this on many examples before.
Could you share the original image for this case? I can run a test on it.
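One way to put a number on how far the two masks diverge is sketched below. It is only an illustration, not code from the repository: the function name `compare_masks`, the 0.5 threshold, and the toy values are assumptions. It takes two flattened masks with values in [0, 1] and reports the largest and mean per-pixel difference plus the IoU of the binarized masks.

```python
def compare_masks(mask_a, mask_b, threshold=0.5):
    """Compare two flattened masks with values in [0, 1].

    Returns (max_abs_diff, mean_abs_diff, iou), where IoU is computed on
    the masks binarized at `threshold` (hypothetical helper, not part of
    BiRefNet).
    """
    a = list(mask_a)
    b = list(mask_b)
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    if not a:
        raise ValueError("masks must not be empty")

    # Per-pixel absolute differences between the two predictions.
    diffs = [abs(x - y) for x, y in zip(a, b)]
    max_abs_diff = max(diffs)
    mean_abs_diff = sum(diffs) / len(diffs)

    # Binarize both masks and measure foreground overlap.
    fg_a = [x > threshold for x in a]
    fg_b = [y > threshold for y in b]
    intersection = sum(1 for p, q in zip(fg_a, fg_b) if p and q)
    union = sum(1 for p, q in zip(fg_a, fg_b) if p or q)
    iou = intersection / union if union else 1.0  # two empty masks agree

    return max_abs_diff, mean_abs_diff, iou


# Toy values only: five foreground pixels on "CPU", four on "GPU".
cpu_mask = [1.0, 1.0, 1.0, 1.0, 1.0, 0.0]
gpu_mask = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
max_diff, mean_diff, iou = compare_masks(cpu_mask, gpu_mask)
```

With PyTorch outputs, each mask could be passed in as `pred.flatten().tolist()` after moving it to the CPU. A maximum difference near zero would match the expectation stated above, while a low IoU would confirm that a whole region (such as one person) is missing from one of the masks.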
Thanks for the reply. I'll attach the image here. If there is a better way for you to receive it, please let me know.
BTW, did you correctly set the resolution? I obtained similar results with BiRefNet_HR + 1024x1024.
Make sure that BiRefNet_HR takes 2048x2048 inputs while BiRefNet takes 1024x1024 inputs.
Sure. As you know, the sample code uses transforms.Compose(). I used it as is. The first transform is set to (2048, 2048).
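The model-to-resolution pairing discussed here can be written down as a small guard that runs before inference. This is a minimal sketch under assumptions: `EXPECTED_INPUT_SIZE` and `check_input_size` are hypothetical names, and the table contains only the two sizes stated in this thread.

```python
# Input sizes as stated in this thread; other checkpoints are not covered.
EXPECTED_INPUT_SIZE = {
    "BiRefNet": (1024, 1024),
    "BiRefNet_HR": (2048, 2048),
}


def check_input_size(model_name, image_size):
    """Return True if image_size matches the size stated for model_name.

    Hypothetical helper for illustration; raises ValueError for models
    whose input size was not mentioned in this thread.
    """
    try:
        expected = EXPECTED_INPUT_SIZE[model_name]
    except KeyError:
        raise ValueError(f"no input size recorded for {model_name!r}") from None
    return tuple(image_size) == expected


# The mismatch mentioned above: BiRefNet_HR fed 1024x1024 inputs.
hr_at_1024 = check_input_size("BiRefNet_HR", (1024, 1024))
hr_at_2048 = check_input_size("BiRefNet_HR", (2048, 2048))
```

Here `hr_at_1024` is `False` and `hr_at_2048` is `True`, so the same tuple that is passed to the resize step of the preprocessing pipeline can be checked once before running either device.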
BTW, did you correctly set the resolution? I obtained similar results with BiRefNet_HR + 1024x1024.
Make sure that BiRefNet_HR takes 2048x2048 inputs while BiRefNet_HR takes 1024x1024 inputs.
I'm assuming you meant that BiRefNet, rather than BiRefNet_HR, takes 1024x1024.
Yeah, my mistake on that typo. I've fixed it.