Gabriele Sarti

29 issues by Gabriele Sarti

### System Info

- `transformers` version: 4.21.0.dev0
- Platform: Linux-5.3.0-1017-x86_64-with-glibc2.27
- Python version: 3.9.13
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): 1.12.0+cu102 (True)
- Tensorflow version (GPU?): not installed...

bug

Similar to what is currently available in `download_model.py`, add an `argparse` interface to `finetune_nli.py` with the following parameters:

- `model_name`, default `'models/scibert'`, type `str`
- `batch_size`, default 64, type `int`
- `model_save_path`,...

enhancement
help wanted
good first issue
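The `argparse` request in the issue above could be sketched roughly as follows. This is a minimal illustration, not the code in the repository: `build_parser` is a hypothetical helper name, the `--` flag spelling and help strings are assumptions, and `model_save_path` is left out because its default is cut off in the issue text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for finetune_nli.py (names and help text are assumptions)."""
    parser = argparse.ArgumentParser(description="Hypothetical CLI for finetune_nli.py")
    # Defaults and types below are the ones stated in the issue.
    parser.add_argument("--model_name", type=str, default="models/scibert",
                        help="Name or path of the model to fine-tune")
    parser.add_argument("--batch_size", type=int, default=64,
                        help="Training batch size")
    # `model_save_path` is also requested, but its default is truncated in the
    # issue text, so it is intentionally not defined here.
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--batch_size", "32"])
print(args.model_name, args.batch_size)
```

Passing an explicit list to `parse_args` keeps the sketch testable; a real script would typically call `parse_args()` with no arguments to read the command line.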

# What does this PR do?

Fixes #18049 following the exact same procedure used in #17437. Besides the added test, I also evaluated the fix on my personal use-case and...

Hello, I'm trying to make the `DistributedBloomForCausalLM` work with our library [`inseq`](https://github.com/inseq-team/inseq) to extract feature attributions from BLOOM generations. However, at the moment I am facing some issues that prevent...

help wanted

Hi all, with the 0.4.0 release, ferret now supports batching for methods requiring multiple steps of approximation, such as Lime and Integrated Gradients. However, the name of...

discussion

Dear authors, Just wanted to point out that a good number of highlights in SCAT sentences appear to be malformed, most likely due to sequential insertion of `hon`/`hoff` tags without...

I am experiencing some difficulties when loading the files using Python's pandas library, since they do not appear to be in the standard UTF-8 format. I tried to use the...

Hi, I'm trying to run the training script with Python 3.8.10 and `torch==1.10.2+cu113`, and I obtain the following error:

```shell
>> bash thualign/bin/train.sh -s mask_align -e agree_deen
running mask_align
Traceback...
```

## Description

This PR addresses the possibility of skipping special tokens during attribution using the `skip_special_tokens=True` argument.

Example usage (regular attribution):

```python
import inseq

model = inseq.load_model("mymusise/CPM-Generate-distill", "integrated_gradients")
out =...
```

## Description

Implements the rollout aggregation function originally described by [Abnar and Zuidema (2020)](https://aclanthology.org/2020.acl-main.385/), and later applied for encoder-decoder attribution by [Ferrando et al. (2022)](https://aclanthology.org/2022.emnlp-main.599/).

### Notes

This implementation was...
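For readers unfamiliar with rollout, the core idea from Abnar and Zuidema (2020) can be sketched in a few lines of plain Python. This is only an illustrative sketch and not the implementation in this PR: the function names are hypothetical, each layer is assumed to be a single square, head-averaged attention matrix, and the residual connection is modelled as an equal mix of attention and identity.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def add_residual(attn: Matrix) -> Matrix:
    """Mix attention with the identity (0.5 * A + 0.5 * I) and renormalize rows."""
    n = len(attn)
    mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    return [[value / sum(row) for value in row] for row in mixed]


def attention_rollout(layers: List[Matrix]) -> Matrix:
    """Roll out attention across layers, ordered from the input layer upward.

    Assumption: each entry of `layers` is a head-averaged attention matrix.
    """
    rollout = add_residual(layers[0])
    for attn in layers[1:]:
        # Each layer's (residual-adjusted) attention is applied on top of the
        # rollout accumulated from the layers below it.
        rollout = matmul(add_residual(attn), rollout)
    return rollout


uniform = [[0.5, 0.5], [0.5, 0.5]]
print(attention_rollout([uniform, uniform]))
```

Row `i` of the result estimates how much each input position contributes to position `i` at the top layer; the rows remain normalized because every factor in the product is row-stochastic.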