Yen-Chun Chen
This might be a bug. I did not test 2 layers since 1 layer already gives good results. I will investigate this ASAP.
Thanks for pointing this out. I will investigate.
Even if there's only one sentence, [batch_size](https://github.com/ChenRocks/fast_abs_rl/blob/aebf539107caba5be35720f5d1f9f98989a069e8/data/batcher.py#L115) would be 1, and the function handles this case properly. Is your data in the correct format?
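For illustration only (this is a hypothetical sketch, not the repository's actual batcher code): a padded collate in this style degenerates gracefully to a one-element batch.

```python
# Hypothetical sketch: a padded collate like the linked function
# still produces a well-formed batch when given a single sentence.
def pad_batch(token_id_lists, pad_id=0):
    # With one sentence, this is simply a list containing one list.
    max_len = max(len(ids) for ids in token_id_lists)
    return [ids + [pad_id] * (max_len - len(ids)) for ids in token_id_lists]

print(pad_batch([[5, 8, 2]]))        # [[5, 8, 2]] -- a valid 1 x 3 batch
print(pad_batch([[5, 8, 2], [7]]))   # [[5, 8, 2], [7, 0, 0]]
```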
Thanks for the explanation. I understand the issue now. If you are working on a dataset that has only 1-sentence summary outputs, this repository might not be the best choice...
Thanks for pointing this out! I think your solution should work as intended. I will test how this affects the results when I have time.
Currently there is no such function. The training should finish in a reasonable time on a GPU, and I currently have no plan to support this. Feel free to contribute and...
You should not see the apex loss scaler reducing the loss scale to less than 1.

```
[1,0]:Gradient overflow. Skipping step, loss scaler 5 reducing loss scale to 4.3601508761683463e-106
```

...
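Not from the original thread, but for context: a minimal sketch of pinning apex's loss scale to a fixed value so that dynamic scaling cannot collapse toward zero. This assumes NVIDIA apex's `amp` API; `opt_level="O1"` and `loss_scale=128.0` are illustrative choices, not recommendations from this repo.

```python
# Sketch: replace apex's dynamic loss scaling (the source of the
# "reducing loss scale" messages above) with a fixed scale.
import torch
from apex import amp

model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.Adam(model.parameters())

# loss_scale defaults to "dynamic"; passing a float pins it.
model, optimizer = amp.initialize(
    model, optimizer, opt_level="O1", loss_scale=128.0
)

loss = model(torch.randn(4, 10).cuda()).sum()
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```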
Thanks for your inquiry.

> My questions are:
>
> 1. Is this issue with batched requests fixed with Phi-3.5-vision? (I see batch size = 64 in [this training script](https://github.com/microsoft/Phi-3CookBook/blob/main/md/04.Fine-tuning/FineTuning_Vision.md#docvqa-note-phi-3-vision))

...
Thanks @ladanisavan for your inquiry. Unfortunately, BBox support is currently not available in Phi-3.x-vision. We appreciate the feedback and will discuss this feature request for future versions. In the meantime,...
Hi @qwedaq, thanks for reporting your results. Note that all deep learning training has inherent randomness; therefore, it is possible that a re-run results in a slight accuracy difference. However, in...
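On the point about run-to-run randomness: a standard way to reduce (not eliminate) variance between training runs is to fix the random seeds. The snippet below uses only stock Python/NumPy/PyTorch calls and is not taken from the repository; the seed value is arbitrary.

```python
# Common reproducibility setup for PyTorch training runs.
import random
import numpy as np
import torch

def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Deterministic cuDNN kernels trade some speed for reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)
```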