Gabriele Sarti issues

Results 29 issues of Gabriele Sarti

NaN in XGLM Softmax with FP16

### System Info

- `transformers` version: 4.21.0.dev0
- Platform: Linux-5.3.0-1017-x86_64-with-glibc2.27
- Python version: 3.9.13
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): 1.12.0+cu102 (True)
- Tensorflow version (GPU?): not installed...

bug

Add argparse parametrization for the finetuning script

Similar to what is currently available in `download_model.py`, add `argparse` support to `finetune_nli.py` for the following parameters:

- `model_name`, default 'models/scibert', type `str`
- `batch_size`, default 64, type `int`
- `model_save_path`,...

enhancement

help wanted

good first issue
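
The request above can be sketched as follows. This is a minimal illustration, not the script's actual code: `build_parser` is a hypothetical helper name, and because the default for `model_save_path` is cut off in the issue preview, it is left as `None` here purely as a placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the parameters named in the issue (illustrative only)."""
    parser = argparse.ArgumentParser(description="Arguments for finetune_nli.py (sketch)")
    # Names, defaults, and types below are taken from the issue text.
    parser.add_argument("--model_name", type=str, default="models/scibert")
    parser.add_argument("--batch_size", type=int, default=64)
    # ASSUMPTION: the real default is truncated in the issue preview, so None is a placeholder.
    parser.add_argument("--model_save_path", type=str, default=None)
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["--batch_size", "32"])
print(args.model_name, args.batch_size, args.model_save_path)
```

Passing an explicit list to `parse_args` keeps the example independent of the command line; in a script, calling `parse_args()` with no arguments reads `sys.argv` instead.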

XGLM - Fix Softmax NaNs when using FP16

# What does this PR do?

Fixes #18049 following the exact same procedure used in #17437. Besides the added test, I also evaluated the fix on my personal use-case and...

:hugs: transformers compatibility issues

Hello, I'm trying to make the `DistributedBloomForCausalLM` work with our library [`inseq`](https://github.com/inseq-team/inseq) to extract feature attributions from BLOOM generations. However, at the moment I am facing some issues that prevent...

help wanted

`batch_size` potentially misleading

Hi all, with the 0.4.0 release, ferret now supports batching for methods requiring multiple steps of approximation, such as LIME and Integrated Gradients. However, the name of...

discussion

Malformed highlight tags

Dear authors, Just wanted to point out that a good number of highlights in SCAT sentences appear to be malformed, most likely due to sequential insertion of `hon`/`hoff` tags without...
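
One way to spot such sentences is a small pairing check. The sketch below assumes the tags appear literally as `<hon>` and `<hoff>` in the text and treats a highlight as well formed when every opening tag is closed before the next one opens. Both the tag syntax and the helper name `highlights_well_formed` are assumptions for illustration, not part of the dataset's tooling.

```python
import re

# ASSUMPTION: highlight tags are written literally as <hon> and <hoff>.
TAG_PATTERN = re.compile(r"<(hon|hoff)>")


def highlights_well_formed(sentence: str) -> bool:
    """Return True if every <hon> is closed by a <hoff> before another <hon> opens."""
    inside_highlight = False
    for match in TAG_PATTERN.finditer(sentence):
        if match.group(1) == "hon":
            if inside_highlight:  # nested or repeated opening tag
                return False
            inside_highlight = True
        else:
            if not inside_highlight:  # closing tag without a matching opening tag
                return False
            inside_highlight = False
    return not inside_highlight  # an unclosed opening tag is also malformed


print(highlights_well_formed("She saw <hon>it<hoff> yesterday."))
print(highlights_well_formed("She <hon>saw <hon>it<hoff> yesterday."))
```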

Texts-Together-OneCSVperFile are not in UTF-8

I am experiencing some difficulties when loading the files using Python's pandas library since they do not appear to be in the standard UTF-8 format. I tried to use the...
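
A common workaround for this kind of problem is to try a short list of candidate encodings until one decodes cleanly. The sketch below uses only the standard library. The candidate list is an assumption, since the actual encoding of the files is not stated in the issue, and `decode_with_fallback` is a hypothetical helper name.

```python
def decode_with_fallback(raw: bytes, encodings=("utf-8", "cp1252", "latin-1")):
    """Decode raw bytes with the first candidate encoding that succeeds.

    Returns a (text, encoding) pair. The candidate list is a guess, not a
    statement about how the files in the issue are actually encoded.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    raise ValueError("none of the candidate encodings could decode the data")


# Bytes that are valid cp1252 but not valid UTF-8.
sample = "café".encode("cp1252")
text, used = decode_with_fallback(sample)
print(text, used)
```

Once a working encoding is identified, it can be passed to `pandas.read_csv` through its `encoding` parameter. Note that `latin-1` never fails to decode, so a successful result with it does not prove the text is correct.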

Issue with namespace using train.sh

Hi, I'm trying to run the training script with Python 3.8.10 and `torch==1.10.2+cu113`, and I obtain the following error: ```shell >> bash thualign/bin/train.sh -s mask_align -e agree_deen running mask_align Traceback...

Add possibility to skip special tokens during attribution

## Description This PR addresses the possibility of skipping special tokens during attribution using the `skip_special_tokens=True` argument. Example usage (regular attribution): ```python import inseq model = inseq.load_model("mymusise/CPM-Generate-distill", "integrated_gradients") out =...

Rollout aggregation function

## Description

Implements the rollout aggregation function originally described by [Abnar and Zuidema (2020)](https://aclanthology.org/2020.acl-main.385/), and later applied to encoder-decoder attribution by [Ferrando et al. (2022)](https://aclanthology.org/2022.emnlp-main.599/).

### Notes

This implementation was...
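
For readers unfamiliar with the technique, here is a minimal pure-Python sketch of attention rollout as formulated by Abnar and Zuidema (2020): each layer's attention matrix is averaged with the identity to account for residual connections, and the results are multiplied from the first layer upward. It is not the implementation from this PR; the function names are hypothetical, and it assumes one square, head-averaged attention matrix per layer.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add_residual(attention):
    """Mix attention with the identity (0.5 * A + 0.5 * I) and renormalize each row."""
    n = len(attention)
    mixed = [
        [0.5 * attention[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[value / sum(row) for value in row] for row in mixed]


def attention_rollout(layers):
    """Propagate attention from the first layer to the last one.

    `layers` is a list of square attention matrices ordered from the first
    layer to the last (ASSUMPTION: heads were already averaged).
    """
    rollout = add_residual(layers[0])
    for attention in layers[1:]:
        rollout = matmul(add_residual(attention), rollout)
    return rollout


uniform = [[0.5, 0.5], [0.5, 0.5]]
result = attention_rollout([uniform, uniform])
print(result)
```

With row-stochastic inputs, every row of the result still sums to 1, so the output can be read as a distribution over input positions for each output position.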