
Add text2text example (e.g., text summarisation)

Open frascuchon opened this issue 3 years ago • 8 comments

Add a text summarisation fine-tuning tutorial similar to the sentiment classifier fine-tuning tutorial:

https://rubrix.readthedocs.io/en/stable/tutorials/06-labeling-finetuning.html#3.-Fine-tune-the-pre-trained-mode

frascuchon avatar Oct 05 '21 13:10 frascuchon

I'd like to work on this, however, it seems the URL provided is outdated/broken.

iakhil avatar Oct 28 '21 09:10 iakhil

Hi @iakhil ! You are right!

I think the best approach for this issue is to add a text summarization example here:

https://rubrix.readthedocs.io/en/stable/guides/task_examples.html#Text2Text-(Experimental)

The idea is to use the pretrained text summarization pipeline from Hugging Face (https://huggingface.co/transformers/main_classes/pipelines.html#summarizationpipeline) and add the example to this notebook (after the machine translation one):

https://github.com/recognai/rubrix/blob/master/docs/guides/task_examples.ipynb

For example:

from transformers import pipeline

summarizer = pipeline("summarization")
# return three predictions
predictions = summarizer("A LONG TEXT....", num_return_sequences=3)

# log three predictions in rubrix
# follow the machine translation example
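
To make the logging step more concrete, here is a minimal sketch of turning the summarizer output into a list of prediction strings. It assumes the pipeline returns a list of dicts with a `summary_text` key. The helper name `summaries_to_predictions`, the sample strings, and the dataset name `summarization` are hypothetical, and the Rubrix call in the trailing comment is an assumption modelled on the machine translation example, not a verified signature.

```python
from typing import Dict, List


def summaries_to_predictions(outputs: List[Dict[str, str]]) -> List[str]:
    """Pull the generated text out of each summarization result."""
    return [item["summary_text"].strip() for item in outputs]


# Stand-in for: summarizer("A LONG TEXT....", num_return_sequences=3)
# (made-up strings, shaped like the pipeline's list-of-dicts output)
example_outputs = [
    {"summary_text": " First candidate summary. "},
    {"summary_text": "Second candidate summary."},
    {"summary_text": "Third candidate summary."},
]

predictions = summaries_to_predictions(example_outputs)
print(predictions)

# Assumed Rubrix call (check it against the installed version before use):
# import rubrix as rb
# rb.log(
#     rb.Text2TextRecord(text="A LONG TEXT....", prediction=predictions),
#     name="summarization",  # hypothetical dataset name
# )
```

Keeping the extraction in a small helper makes it easy to check that all three candidates survive before anything is sent to the server.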

dvsrepo avatar Oct 28 '21 09:10 dvsrepo

Hi @iakhil , did you manage to take a look at this? Let us know if you need support

dvsrepo avatar Nov 05 '21 12:11 dvsrepo

Hey! In order to log predictions in Rubrix, the Docker-contained web app should be running, right? When I try to run it using the first method, I get the following error:

E:\rubrix>wget -O docker-compose.yml https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml && docker-compose up
converted 'https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml' (ASCII) -> 'https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml' (UTF-8)
--2021-11-06 02:35:07-- https://raw.githubusercontent.com/recognai/rubrix/master/docker-compose.yaml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
Unable to establish SSL connection.

How should I go about it?

iakhil avatar Nov 05 '21 21:11 iakhil

Hi @iakhil

There seems to be a problem with SSL certificates when downloading the docker-compose.yml file.

You can copy the file's content and create the file locally, then run the docker-compose up command.

I hope this helps you

frascuchon avatar Nov 12 '21 12:11 frascuchon

Thanks! I got it running. Are we supposed to have three predictions (as denoted by the value of num_return_sequences)? In Rubrix, I can see only one prediction for both machine translation and text summarization.

iakhil avatar Nov 12 '21 20:11 iakhil

Hi @iakhil, sorry for the late reply. In principle you should be able to view the 3 predictions using the arrows in both the annotation and explore modes. Would you mind sharing a screenshot of the records for which you don't see the predictions?

dvsrepo avatar Dec 06 '21 15:12 dvsrepo

Earlier, I had connected to the server locally using python -m rubrix.server but now it doesn't seem to work. I get an error that says ERROR: Application startup failed. Exiting. Any way to fix this?

iakhil avatar Dec 14 '21 15:12 iakhil

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Sep 23 '22 04:09 github-actions[bot]

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Oct 09 '22 03:10 github-actions[bot]

Hey! Can I pick up this issue for Hacktoberfest, if it's still pending?

krishnajalan avatar Oct 17 '22 19:10 krishnajalan

Absolutely! It's still pending

dvsrepo avatar Oct 17 '22 19:10 dvsrepo

@krishnajalan @iakhil https://docs.argilla.io/en/latest/guides/tasks/text_generation.html#Text-Summarization

davidberenstein1957 avatar Nov 08 '22 11:11 davidberenstein1957