Junbum Lee
The current demo parameters were chosen somewhat arbitrarily as values that happen to work reasonably well. Only two options are set:
- temperature 0.5
- top_p 0.95
(For a chatbot, a lower temperature tends to help a bit.)
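For reference, a minimal sketch of a generation call using just these two options, assuming the Hugging Face transformers API (the checkpoint name and prompt are placeholders, not the actual demo code):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; the demo's actual model may differ.
tokenizer = AutoTokenizer.from_pretrained("beomi/KoAlpaca-Polyglot-5.8B")
model = AutoModelForCausalLM.from_pretrained("beomi/KoAlpaca-Polyglot-5.8B")

inputs = tokenizer("...", return_tensors="pt")  # placeholder prompt
outputs = model.generate(
    **inputs,
    do_sample=True,     # sampling must be enabled for temperature/top_p to apply
    temperature=0.5,    # lower temperature tends to help chatbot-style replies
    top_p=0.95,
    max_new_tokens=256,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```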
Closing this now that it has been released publicly at https://github.com/Beomi/KoAlpaca/tree/main/webui.
TODO: create the header
TODO: replace hardcoded values -> load them from a JSON file, envs.json
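A minimal sketch of that change (the keys below are hypothetical; only the envs.json file name comes from the note above):

```python
import json

# Load settings from envs.json instead of hardcoding them in the source.
with open("envs.json") as f:
    envs = json.load(f)

API_URL = envs["api_url"]        # hypothetical key, previously hardcoded
MODEL_NAME = envs["model_name"]  # hypothetical key, previously hardcoded
```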
This repo is under development, so please wait :)
But you can try this repo now by following the README guide! The paper isn't fully implemented yet, but it's worth a try 👍
Did you train the model? Just loading it with the gate (without training) will output random tokens.
Oh, I think LoRA is not compatible with this: the model has to get a chance to learn how to use the long-term memory, but if you initialize with LoRA, ...
Oh, your loss seems pretty high. If I were you, I would wait until the loss reaches ~4; an LM train loss above 4 typically produces output closer to random tokens than to fluent generation.
Since Infini-attention uses segmentation, which mainly aims to reduce memory usage and computational cost to O(N), if you use a very long sequence such as seq len = ...
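To illustrate the point, here is a rough sketch of the segment-wise loop that yields the linear cost, with a simplified memory read/write standing in for the paper's actual linear-attention update (an illustration, not the repo's implementation):

```python
import torch

def infini_style_forward(x, segment_len, attn_block):
    """x: (batch, seq_len, dim); attn_block maps (batch, s, dim) -> (batch, s, dim)."""
    batch, seq_len, dim = x.shape
    memory = torch.zeros(batch, dim, dim)  # fixed-size compressive memory
    outs = []
    for start in range(0, seq_len, segment_len):
        seg = x[:, start:start + segment_len]
        local = attn_block(seg)  # attention is quadratic only within the segment
        # Simplified memory read: retrieve context carried over from past segments.
        mem_read = torch.einsum("bsd,bde->bse", seg, memory)
        outs.append(local + mem_read)
        # Simplified memory write: fold this segment into the fixed-size state.
        memory = memory + torch.einsum("bsd,bse->bde", seg, seg)
    return torch.cat(outs, dim=1)
```

Because the memory stays a fixed (dim x dim) matrix, the cost grows linearly with the number of segments instead of quadratically with the full sequence length.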