Gh stronger detection classifiers

Open TanguyUrvoy opened this issue 3 years ago • 7 comments

Add Random Forest and Gradient Boosting from sklearn to the single table detection tests. Being able to fool these classifiers would be a great improvement for generative models.

TanguyUrvoy avatar Mar 10 '22 10:03 TanguyUrvoy
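
For context, a detection test of the kind described in the opening comment trains a classifier to tell real rows from synthetic ones: the harder that is, the better the synthetic data. The following is a minimal sketch, not the code from this PR. The helper `detection_score`, the toy Gaussian data, and the score normalization (1 minus the rescaled ROC AUC) are assumptions made for illustration; only the scikit-learn classes `RandomForestClassifier` and `GradientBoostingClassifier` come from the proposal.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def detection_score(real, synthetic, classifier, n_splits=3, seed=0):
    """Hypothetical helper: score how hard it is to detect synthetic rows.

    Returns a value in [0, 1]; higher means the classifier was fooled more.
    The normalization used here is an assumption of this sketch.
    """
    # Label real rows 1 and synthetic rows 0, then stack them into one dataset.
    X = np.vstack([real, synthetic])
    y = np.concatenate([np.ones(len(real)), np.zeros(len(synthetic))])

    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    normalized = []
    for train_idx, test_idx in folds.split(X, y):
        model = clone(classifier)  # fresh, unfitted copy for each fold
        model.fit(X[train_idx], y[train_idx])
        proba = model.predict_proba(X[test_idx])[:, 1]
        auc = roc_auc_score(y[test_idx], proba)
        # Rescale AUC from [0.5, 1] to [0, 1]; below-chance AUC counts as 0.
        normalized.append(max(0.5, auc) * 2 - 1)
    return 1.0 - float(np.mean(normalized))


# Toy data (assumption): one synthetic sample close to the real one, one far.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 3))
close_synth = rng.normal(0.0, 1.0, size=(200, 3))
far_synth = rng.normal(3.0, 1.0, size=(200, 3))

classifiers = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}
results = {
    name: {
        "close": detection_score(real, close_synth, clf),
        "far": detection_score(real, far_synth, clf),
    }
    for name, clf in classifiers.items()
}
for name, scores in results.items():
    print(name, scores)
```

In this sketch a score near 1 means the classifier could not separate the two samples, while a score near 0 means the synthetic rows were easy to detect.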

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Mar 10 '22 10:03 CLAassistant

Hi, I changed the class names and checked the tests.

TanguyUrvoy avatar May 13 '22 09:05 TanguyUrvoy

Codecov Report

Merging #120 (cf22966) into master (122f42e) will increase coverage by 0.18%. The diff coverage is 93.10%.

@@            Coverage Diff             @@
##           master     #120      +/-   ##
==========================================
+ Coverage   50.84%   51.03%   +0.18%     
==========================================
  Files          51       51              
  Lines        1530     1540      +10     
==========================================
+ Hits          778      786       +8     
- Misses        752      754       +2     
| Impacted Files | Coverage Δ |
|---|---|
| sdmetrics/__init__.py | 39.13% <ø> (ø) |
| ...dmetrics/column_pairs/statistical/kl_divergence.py | 61.53% <ø> (ø) |
| sdmetrics/errors.py | 100.00% <ø> (ø) |
| sdmetrics/single_column/statistical/cstest.py | 68.42% <ø> (ø) |
| sdmetrics/single_column/statistical/kstest.py | 72.22% <ø> (ø) |
| sdmetrics/single_table/__init__.py | 100.00% <ø> (ø) |
| sdmetrics/single_table/base.py | 35.71% <ø> (ø) |
| sdmetrics/single_table/privacy/loss.py | 39.13% <ø> (ø) |
| sdmetrics/timeseries/ml_scorers.py | 21.56% <ø> (ø) |
| sdmetrics/single_table/detection/sklearn.py | 78.37% <83.33%> (+1.45%) :arrow_up: |

... and 12 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update eacec8a...cf22966.

codecov-commenter avatar May 13 '22 15:05 codecov-commenter

@TanguyUrvoy make lint returns the following errors:

Run invoke lint
No broken requirements found.
ERROR: /home/runner/work/SDMetrics/SDMetrics/sdmetrics/single_table/__init__.py Imports are incorrectly sorted.
ERROR: /home/runner/work/SDMetrics/SDMetrics/sdmetrics/single_table/base.py Imports are incorrectly sorted.
ERROR: /home/runner/work/SDMetrics/SDMetrics/sdmetrics/single_table/detection/sklearn.py Imports are incorrectly sorted.
ERROR: /home/runner/work/SDMetrics/SDMetrics/sdmetrics/single_table/detection/__init__.py Imports are incorrectly sorted.
ERROR: /home/runner/work/SDMetrics/SDMetrics/tests/integration/single_table/test_single_table.py Imports are incorrectly sorted.
Error: Process completed with exit code 1.

You can run make fix-lint to correctly sort the imports, and make lint to verify that the imports are sorted correctly.

katxiao avatar May 16 '22 15:05 katxiao

Thanks, it seems OK now 😊

TanguyUrvoy avatar May 17 '22 11:05 TanguyUrvoy

@TanguyUrvoy We're still seeing lint issues. The lint fixes should only be on the files that you are modifying, because the other files already pass the lint. Could you make sure you're installing the correct dependencies by running make install-develop in a clean environment?

katxiao avatar May 19 '22 17:05 katxiao

I do not understand why it fails on parts like timeseries, which are not affected by my changes.

Or maybe it was the import reordering done by make fix-lint that caused these errors.

Tanguy


From: URVOY Tanguy INNOV/IT-S Sent: Friday, 20 May 2022 10:19:46 To: sdv-dev/SDMetrics; sdv-dev/SDMetrics Cc: Mention Subject: RE: [sdv-dev/SDMetrics] Gh stronger detection classifiers (PR #120)

Thanks for your patience 😊

I hope this version will pass the tests...


TanguyUrvoy avatar May 20 '22 08:05 TanguyUrvoy