Frédéric Branchaud-Charron

https://dref360.github.io/

@GlowstickAI Montréal, Canada Baal/Azimuth Maintainer - 💜 FOSS, ML, MLOps

Results: 137 comments by Frédéric Branchaud-Charron

Add self-training option

That's an interesting idea. Do you have some references we could look into? It's similar to semi-supervised learning, which [we have investigated](https://github.com/baal-org/baal/tree/master/experiments/ssl_experiments) but not really maintained. It became a...

`drop_duplicates` method

There is an open issue #2514 about this which also proposes solutions.

Bug: Type Mismatch in Dataset Mapping

Hello, thanks for submitting an issue. FWIU, the issue is that `datasets` tries to limit casting [ref](https://github.com/huggingface/datasets/blob/ca58154bba185c1916ca5eea4e33b27258642044/src/datasets/arrow_writer.py#L526) and as such will try to convert your strings back to int to...

Investigate saliency task - is memory leaking?

So we should just add `hf_pipeline.model.zero_grad()` after we remove the hooks? https://github.com/ServiceNow/azimuth/blob/main/azimuth/modules/model_contracts/hf_text_classification.py#L169
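A minimal sketch of the pattern being asked about, assuming PyTorch. The tiny `model` below is a hypothetical stand-in for `hf_pipeline.model`, and the hook and variable names are illustrative, not Azimuth's actual code. It captures an intermediate activation with a forward hook, computes a gradient-based saliency, then removes the hook and clears the accumulated gradients.

```python
import torch
from torch import nn

# Hypothetical stand-in for `hf_pipeline.model`: embedding -> flatten -> classifier.
model = nn.Sequential(nn.Embedding(10, 4), nn.Flatten(), nn.Linear(12, 2))
captured = []


def save_output(module, inputs, output):
    # Keep the gradient of this non-leaf tensor so it can be read after backward().
    output.retain_grad()
    captured.append(output)


# Register the hook on the embedding layer and run one forward/backward pass.
handle = model[0].register_forward_hook(save_output)
tokens = torch.tensor([[1, 2, 3]])
logits = model(tokens)
logits[0, 1].backward()

# One saliency score per token: the norm of the gradient w.r.t. its embedding.
saliency = captured[0].grad.norm(dim=-1)

# Cleanup: remove the hook, drop references to the graph, and reset gradients
# so nothing accumulates across calls.
handle.remove()
captured.clear()
model.zero_grad()
```

Whether this resolves the suspected leak is not established here; the sketch only shows where the `zero_grad()` call would sit relative to hook removal.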

Support for streaming metrics

In 1) you talk about a flag-enabled feature. I think this would fix 3).
```python
# This works
mf = flm.MetricFrame('accuracy_score', [], [], sensitive_features=[], streaming=True)
# This should raise
mf...
```

Support for streaming metrics

Yes, let's start with an inefficient v1. I'll try to get a PR opened before the next meeting in January.

Update items in the dataset without `map`

Hello! Have you looked at `Dataset.shard`? [Docs](https://huggingface.co/docs/datasets/en/process#shard) Using this method, you could break your dataset into N shards, apply `map` to each shard, and concatenate them back.
