feat(weave): LangFair Integration
Description
This PR adds two scorer classes, CounterfactualScorer and ToxicityScorer, which compute the counterfactual and toxicity metrics supported by LangFair.
Addresses: https://github.com/wandb/weave/issues/4039
What does the PR do? Include a concise description of the PR contents.
This PR includes the two Scorer classes, unit tests for them, and an example notebook that illustrates a working implementation of the scorers (note: we will remove or relocate the notebook as per the reviewers' suggestion).
Counterfactual Scorer: Given LLM prompts, this class identifies protected words (gender- or race-related) in the prompts/questions, creates counterfactual prompts, generates counterfactual responses, and computes the metrics supported by LangFair ('Cosine', 'RougeL', 'Bleu', 'Sentiment Bias').
Toxicity Scorer: This class measures the toxicity present in an LLM response using a classifier supported by LangFair.
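For reviewers unfamiliar with the counterfactual approach, the idea can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in this PR and not LangFair's API: the word table `GENDER_SWAPS` and the helpers `find_protected_words`, `make_counterfactual`, and `token_overlap` are hypothetical names, and the token-overlap score is a simple stand-in for the Cosine/RougeL/Bleu metrics that LangFair actually computes.

```python
import re

# Hypothetical, deliberately tiny substitution table (assumption for illustration);
# a real protected-word list would be much larger.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "man": "woman", "woman": "man",
    "father": "mother", "mother": "father",
}

WORD_RE = re.compile(r"[A-Za-z']+")


def find_protected_words(prompt: str) -> list[str]:
    """Return the protected words found in the prompt, lowercased, in order."""
    return [w for w in WORD_RE.findall(prompt.lower()) if w in GENDER_SWAPS]


def make_counterfactual(prompt: str) -> str:
    """Swap every protected word for its counterpart, preserving capitalization."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = GENDER_SWAPS.get(word.lower())
        if replacement is None:
            return word
        return replacement.capitalize() if word[0].isupper() else replacement

    return WORD_RE.sub(swap, prompt)


def token_overlap(response_a: str, response_b: str) -> float:
    """Jaccard overlap of word sets; a stand-in for a real similarity metric."""
    a = set(WORD_RE.findall(response_a.lower()))
    b = set(WORD_RE.findall(response_b.lower()))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


prompt = "He asked the woman for advice."
counterfactual_prompt = make_counterfactual(prompt)
```

In a real run, both `prompt` and `counterfactual_prompt` would be sent to the LLM, and the two responses would be compared; a similarity close to 1 suggests the model treats the two variants alike.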
Testing
How was this PR tested?
The PR was tested using the unit tests in the following files:
- tests/scorers/test_counterfactual_scorer.py
- tests/scorers/test_toxicity_scorer.py
The test cassettes are available in the following directory: tests/integration/langfair/cassettes/langfair_test. They were generated by the tests in tests/integration/langfair/langfair_test.py, which contains all the unit tests from test_counterfactual_scorer.py and test_toxicity_scorer.py.
This PR requires manual approval from a wandb user to run all CI checks.
To approve CI for this PR as of this commit, comment:
/approve 4394f204efafa3e45180426805f66a7d8b9256c3
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=04764985904517d17a73e6c3aaa2a03d23737a12
Initial comments by abraham-leal
Hi guys! Thank you so much for the PR! There are a few procedural things to take care of here:
- [x] 1. Please generate test cassettes with our testing framework; more information here: https://github.com/wandb/weave/blob/master/CONTRIBUTING.md#testing and here: https://github.com/wandb/weave/tree/365b0b45e68a92ec8abb76425637c6a0ca9ffcd5/weave/integrations#readme
- [x] 2. Please make sure to lint with nox
- [x] 3. Please add any dependencies to the pyproject under optional, in the scorer section: https://github.com/wandb/weave/blob/master/pyproject.toml
- [x] 4. Instead of a notebook, please add an integration description under local scorers in our docs, a sample PR of another scorer doing this is here: https://github.com/wandb/weave/pull/3698/files
- [x] 5. Please change the scorer names to indicate their LangFair provenance, something like LangfairToxicityScorer
After those changes, it will be a much more complete PR and we can test it locally and in our suite :) THANK YOU SO MUCH AGAIN.
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|---|---|---|---|---|---|---|
| | pypi/langfair@0.6.7 | | | | | |
[!WARNING] Review the following alerts detected in dependencies.
According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.
| Action | Severity | Alert |
|---|---|---|
| Warn | | |
| Warn | | |
@abraham-leal should be ready for review now!
Hi @abraham-leal, I have updated the PR as per your comments. Also, the base branch is set to master (wandb:master); I believe that is correct, right?
hey @abraham-leal - let us know if you need anything else from us on this one. Thank you!
Hey @mohitcek can you provide a link to a public workspace to allow us to see the integration in action please?
Hi @abraham-leal, I appreciate any help on this; perhaps more details or an example would be helpful.
Hi @abraham-leal, here are links to the public workspaces:
- Toxicity assessment: https://wandb.ai/mohitcek-cvs-health/LangFair%20Toxicity%20Score/weave/traces?view=traces_default
- Counterfactual assessment: https://wandb.ai/mohitcek-cvs-health/LangFair%20Counterfactual%20Score/weave/traces?view=traces_default
@abraham-leal @tssweeney let us know if you have any questions or feedback for us. Appreciate your review!
@abraham-leal any updates on this PR? Please let us know.