[Bug]: Error in atrain_crops csv build

Open TamirBar-Tov opened this issue 1 year ago • 7 comments

What happened?

Hi! I ran the code on Windows (PyCharm) and the atrain_crops file isn't built properly. When I run the same code on Colab it works well. The problem is somewhere in the dll.do_main function.

filename: ???????? ???????? ??????????????? ???????? ???????? ???????? ???????? ???????? ??????????? ????????

What did you expect to see?

No response

What version of fastdup were you running on?

2.2

What version of Python were you running on?

Python 3.9

Operating System

Windows

Reproduction steps

No response

Relevant log output

index	filename	crop_filename	col_x	row_y	width	height	label	confidence
0	????????	??????????????????????????g460_64.dll	1525	806	1989	1096	?g	0.76063
1	????????	???????????????????????????	3506	1310	4053	1540	?g	0.66642
2	???????????????	??????????????????????????g_1462.jpg	2606	58	3412	1520	??e???r	0.67283
3	????????	??????????????????????????g2_168.jpg	2279	777	2621	945	?g	0.35012
4	????????	???????????????????????????	1276	1283	1560	1857	?g	0.4677
5	????????	???????????????????????????	34	3112	351	3227	??????t	0.33369
6	????????	???????????????????????????g173.jpg	5022	1566	5670	2873	?t	0.78459
7	????????	???????????????????????????g115.jpg	43	4979	561	5152	???????	0.32928
8	???????????	???????????????????????????g30_454.jpg	2628	1030	3658	1484	?g	0.64583
9	????????	??????????????????????????g	2090	822	2382	1077	?g	0.64642

Attach a screenshot [Optional]

No response

Contact Details [Optional]

[email protected]

TamirBar-Tov avatar Sep 29 '24 11:09 TamirBar-Tov

Hi @TamirBar-Tov, this is related to locale encoding on Windows. You need to compare the environment variables between Jupyter and PyCharm to see where the issue is coming from.

This is from ChatGPT:

Font issues in PyCharm versus Jupyter when using pandas.to_csv could be influenced by several environment variables or settings:

Locale Settings: Ensure that the locale in PyCharm is set correctly. Jupyter may be using UTF-8 encoding by default, while PyCharm could be defaulting to a different encoding (e.g., ANSI).

Check the PYTHONIOENCODING environment variable in PyCharm. Set it to utf-8 if not already set:

PYTHONIOENCODING=utf-8

Console Encoding: PyCharm's console may not handle special characters or encodings as well as Jupyter's interface. Check the console encoding settings in PyCharm:

Go to File > Settings > Editor > File Encodings and ensure UTF-8 is set for "Global Encoding" and "Project Encoding."

Pandas Display Options: Pandas might display text differently based on the environment. Try enforcing the encoding when saving the CSV file:

df.to_csv('file.csv', encoding='utf-8')

Font in PyCharm: If PyCharm is using a font that doesn't support special characters, change the font under File > Settings > Editor > Font to something like Consolas or another monospaced font that supports Unicode.

System Locale: On Windows, the system locale may differ between applications. Ensure that the locale for non-Unicode programs is set to a UTF-8 compatible option.
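One way to do the comparison suggested above is to run the same small script in both environments and diff the output. This is only a sketch: the helper name `encoding_report` is made up for illustration, and it reports standard Python settings only. It does not inspect anything inside fastdup or `dll.do_main`, so it may not pinpoint the cause.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encoding-related settings of the current interpreter."""
    return {
        # Encoding Python uses by default for text files opened without encoding=...
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding used to convert file names between str and bytes
        "filesystem_encoding": sys.getfilesystemencoding(),
        # Encoding of the console or notebook output stream (may be missing)
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Environment variables that override the defaults, if set
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "PYTHONUTF8": os.environ.get("PYTHONUTF8"),
        # 1 when Python's UTF-8 mode is active
        "utf8_mode": sys.flags.utf8_mode,
    }


# Print one setting per line so the two outputs are easy to diff.
for key, value in encoding_report().items():
    print(f"{key}: {value}")
```

If the two environments print different values, for example a Windows code page in one and UTF-8 in the other, that difference is a reasonable first place to look.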

dbickson avatar Sep 29 '24 11:09 dbickson

Thanks for your response. I'm working with Micky Fire on an article that uses your platform; we talked before, if you remember. I tried what you suggested and a few other things, but only this file comes back with this problem. The others are good. It's something inside dll.do_main(). Thanks, Tamir

TamirBar-Tov avatar Sep 29 '24 19:09 TamirBar-Tov

Hi @TamirBar-Tov, say hi to Micky Fire! I suggest moving to Linux or macOS. This is Windows crap.

dbickson avatar Sep 30 '24 09:09 dbickson

OK, I will do it. BTW, I'm building a very cool tool with your platform that you could offer as a service to your users. I can show you if you want. Tamir

TamirBar-Tov avatar Sep 30 '24 10:09 TamirBar-Tov

sure we would love to see it!

dbickson avatar Sep 30 '24 10:09 dbickson

In the project I'm working on now I needed to train a YOLO model, but I didn't have examples with annotation files, so I used your model to create them. I ran it on all the examples and kept the ones with good detections. I'm now writing this up as a YOLO example-generator tool, and I think it could be very useful for others as well. Here are some examples:

Good: [image: image.png] [image: image.png]

Not good: [image: image.png] [image: image.png]

In that way I created more than 1000 examples. I'm working on a paper now and hope to publish it :) Tamir
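The selection step described above could look roughly like the following. This is a hypothetical sketch, not the actual tool: the function names, the 0.6 threshold, and the sample rows are all made up. The thread does not say whether the CSV's width and height columns hold box sizes or far-corner coordinates, so the converter takes explicit corner coordinates and leaves that mapping to the reader.

```python
import csv
import io


def keep_confident(rows, threshold=0.6):
    """Return only the detections whose confidence is at least threshold."""
    return [r for r in rows if float(r["confidence"]) >= threshold]


def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Format one pixel box as a YOLO label line.

    YOLO labels are "class cx cy w h", with the box centre and size
    normalised by the image width and height.
    """
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# Made-up sample in the same tab-separated layout as the log in the report.
SAMPLE = "index\tfilename\tconfidence\n0\ta.jpg\t0.76\n1\tb.jpg\t0.33\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
good = keep_confident(rows)
```

A fixed threshold is the simplest filter; a manual review of the kept detections would still be needed to match the "good" and "not good" split shown in the images.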

TamirBar-Tov avatar Oct 01 '24 13:10 TamirBar-Tov

Hi @TamirBar-Tov, the examples were not received well. Can you resend them to my email: danny [ AT ] visual - layer [ DOT ] com

dbickson avatar Oct 01 '24 14:10 dbickson