Said Taghadouini
Results: 12 comments by Said Taghadouini
Sorry for the late response. The code is compatible with HF model implementations, so you can start from any HF BERT model. You can achieve that by defining the model...
In general, I think FA2 support on Windows is not well tested (https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features). We only ever used Linux machines for the pre-training part.
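As a rough illustration of working around this, here is a minimal sketch that only selects FlashAttention-2 on Linux when the `flash_attn` package is importable, and otherwise falls back to PyTorch SDPA. The helper name `pick_attn_implementation` and its arguments are hypothetical and not part of the repository; the strings `"flash_attention_2"` and `"sdpa"` are the values accepted by the `attn_implementation` argument in HF Transformers.

```python
import importlib.util
import platform


def pick_attn_implementation(system=None, has_flash_attn=None):
    """Hypothetical helper: choose an attention backend name.

    Returns "flash_attention_2" only on Linux with flash_attn installed,
    and "sdpa" in every other case (e.g. on Windows).
    """
    if system is None:
        # platform.system() returns e.g. "Linux", "Windows" or "Darwin"
        system = platform.system()
    if has_flash_attn is None:
        # Check for the package without importing it
        has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    if system == "Linux" and has_flash_attn:
        return "flash_attention_2"
    return "sdpa"


print(pick_attn_implementation())
```

The returned string could then be passed as `attn_implementation=...` to `from_pretrained`, assuming the model class you load supports that backend.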