BigScience Workshop
17 repositories owned by BigScience Workshop
bigscience
949 stars, 99 forks
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
biomedical
419 stars, 111 forks
Tools for curating biomedical training data for large-scale language modeling
Megatron-DeepSpeed
1.2k stars, 204 forks
Ongoing research training transformer language models at scale, including: BERT & GPT-2
promptsource
2.5k stars, 337 forks
Toolkit for creating, sharing and using natural language prompts.
t-zero
444 stars, 51 forks
Reproduce results and replicate training for T0 (Multitask Prompted Training Enables Zero-Shot Task Generalization)
data-preparation
285 stars, 40 forks
Code used for sourcing and cleaning the BigScience ROOTS corpus
data_sourcing
31 stars, 6 forks
This repository gathers the tools developed by the Data Sourcing Working Group