FlashRAG
[Update] new functions and bug fixes
In this PR, I made the following changes:
- Add support for `flash-attention-2` in `HFCausalLMGenerator`, and add the Llama-3 special tokens when initializing the model.
- Add multi-GPU support for the refiner, significantly accelerating the refining process.
- Fix a list-index-out-of-bounds bug in `selective-context`.
- Add model paths for `recomp` and fix model loading.
- Fix a bug in saving `intermediate_data.json` when using `bm25s`.
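For context, Hugging Face Transformers exposes FlashAttention-2 through the `attn_implementation` argument of `from_pretrained`, so the generator change likely amounts to passing that flag plus registering Llama-3's turn-end token. A minimal sketch, assuming a standard Transformers setup (the model id and stop-token handling below are illustrative, not taken from the PR):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"  # illustrative model id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,               # flash-attn requires fp16/bf16
    attn_implementation="flash_attention_2",  # needs the flash-attn package installed
    device_map="auto",
)

# Llama-3 ends assistant turns with <|eot_id|> rather than the plain EOS token,
# so it must be added as an extra stop token for generation to terminate
# correctly (an assumption about what "Llama-3 special token" refers to here).
stop_ids = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
outputs = model.generate(**tokenizer("Hello", return_tensors="pt").to(model.device),
                         eos_token_id=stop_ids, max_new_tokens=32)
```

This is a model-loading configuration sketch; the actual `HFCausalLMGenerator` wiring in FlashRAG may differ.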