Luca Soldaini

32 comments by Luca Soldaini

I would recommend running QuickUMLS on [WSL](https://learn.microsoft.com/en-us/windows/wsl/install)

Hey @epwalsh, added the 1B config and set the correct EOS token on both 1B and 7B. Didn't touch any data paths, lmk how you'd like to handle it.

yes @IanMagnusson I'm documenting the 1.5 creation process and will PR soon ❤️

Collected a first version of the corpus. Steps I followed are [here](https://github.com/allenai/LLM/blob/soldni/data/scripts/lucas/s2ag/README.md), but a summary is as follows:

Data info:
- Corpus is located at `s3://ai2-s2-research-public/lucas/s2orc_oa_2022_01_03`
- It is comprised...

I've stumbled upon this issue recently, too.

uh, that is pretty confusing! could you post a sample of the data in your yaml file?

hi @mihara-bot! could you give me more info on the system you are on? you shouldn't need to install rust under x86-64 to use dolma; pypi package should come...

Issues should have been fixed with #66.

This is nice; I will bump in the next version @peterbjorgensen! In the meantime, I recently added support for specifying rules using jq syntax (not the default, but can be...