feat: Add chunking function for sequence tagger training on sentences exceeding token limit

Open MattGPT-ai opened this issue 1 year ago • 4 comments

Closes https://github.com/flairNLP/flair/issues/3519

Adds a Sentence chunking function to allow SequenceTagger training on sentences that exceed the token limit. Also adds tests for this function.

MattGPT-ai avatar Aug 03 '24 02:08 MattGPT-ai
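
To illustrate the general idea behind the description above, here is a minimal sketch of splitting an over-long token sequence into windows that fit a limit. It is not the code from this PR: the name `chunk_tokens` and the `max_len` and `overlap` parameters are hypothetical, and it works on plain lists of strings rather than flair `Sentence` objects.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], max_len: int, overlap: int = 0) -> List[List[str]]:
    """Split `tokens` into consecutive windows of at most `max_len` items.

    Hypothetical helper for illustration only. `overlap` is the number of
    tokens shared between neighbouring windows.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_len]))
        # Stop once the current window already reaches the end of the input,
        # so no trailing window made up only of overlap tokens is produced.
        if start + max_len >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    words = "the quick brown fox jumps over the lazy dog".split()
    for chunk in chunk_tokens(words, max_len=4, overlap=1):
        print(chunk)
```

For sequence tagging, the label sequence would have to be split at the same offsets, and an entity span that crosses a window boundary needs separate handling; this sketch does not address either point.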

It looks like 100% of my tests passed, but the GitHub UI still says my checks failed.

MattGPT-ai avatar Aug 09 '24 17:08 MattGPT-ai

We are getting a "System.IO.IOException: No space left on device" error for the unit tests, as they seem to be taking up too much disk space. I tried removing some of the dataset downloads from the tests in #3526, but it seems that is not enough to prevent this from happening.

alanakbik avatar Aug 09 '24 18:08 alanakbik

> We are getting a "System.IO.IOException: No space left on device" error for the unit tests, as they seem to be taking up too much disk space. I tried removing some of the dataset downloads from the tests in #3526, but it seems that is not enough to prevent this from happening.

Is it possible to download just portions of the datasets? Something like 100 samples should be sufficient for unit testing.

MattGPT-ai avatar Aug 09 '24 20:08 MattGPT-ai
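
As a rough sketch of the truncation idea in the question above, the snippet below lazily reads only the first few sentences from a column-formatted (CoNLL-style) text stream, where blank lines separate sentences. The helper names `iter_sentences` and `take_sentences` are hypothetical, and whether a given dataset host allows a partial download is an assumption that would need to be checked per dataset; this shows only the "stop after N samples" logic.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_sentences(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield sentences (lists of token lines) from a blank-line-delimited stream."""
    current: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the sentence collected so far.
            yield current
            current = []
    if current:
        # The stream may end without a trailing blank line.
        yield current


def take_sentences(lines: Iterable[str], limit: int) -> List[List[str]]:
    """Return at most `limit` sentences, consuming the stream only as far as needed."""
    return list(islice(iter_sentences(lines), limit))


if __name__ == "__main__":
    sample = ["George B-PER\n", "visited O\n", "\n", "Berlin B-LOC\n", "\n", "today O\n"]
    print(take_sentences(sample, 2))
```

Because both helpers are lazy, passing an open file or a streamed HTTP response as `lines` would stop reading once `limit` sentences have been collected.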

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 26 '25 04:04 stale[bot]