Juan Manuel Pérez

Results: 35 comments by Juan Manuel Pérez

I do not understand why this fails, as the tokenizer has the `model_max_len` property set. Please report this in the `transformers` repository.

@alexvaca0 Thanks for your interest! We will be publishing the original tweets soon, hopefully in `datasets`. Leave this issue open so we can let you know when they are available.

Hi @alexvaca0. I'm having some problems regarding the original tweets -- that is, the raw tweets prior to any preprocessing and filtering. The machine which contained this data is not...

Well, this is quite late, but finally, the tweets were released. I could only upload half of them, but I suppose this might be enough (~300M tweets). Check https://huggingface.co/datasets/pysentimiento/spanish-tweets In...

Are you planning to publish the EDGAR instances of the dataset?