Harsh Verma
> Ok, I figured out what my problem was. I wasn't using the correct branch. Have you guys been using the rel branch? If you actually check out that branch, then build.py...
> I am still facing this issue even after making sure I was on the release branch. I keep getting a: > > `C:\Users\Administrator\AppData\Local\Microsoft\WindowsApps\python.exe: can't open file 'C:\\Users\\Administrator\\TensorRT\\TensorRT-LLM\\build.py': [Errno 2]...
I revised the script so that it can now retrieve the MAL ID from a specified anime title. (By "title" I mean the exact name we get during anime selection.)
> Wouldn't it make sense to only skip once ? > > Then you would be able to watch the opening/outro by rewinding, and you would also be able to...
There will be two Lua scripts: skip-once.lua and skip-always.lua. Depending on the user's preference, ani-skip will return the appropriate flag.
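Only the two script names come from the comment above; the flag shape and install path below are assumptions. A minimal sketch of how ani-skip might pick which Lua script to hand to mpv (mpv loads a user script via `--script=PATH`):

```python
# Hypothetical sketch: maps a user preference to one of the two Lua
# scripts mentioned above. The directory and flag wiring are assumed.
SKIP_SCRIPTS = {
    "once": "skip-once.lua",      # skip only on the first occurrence
    "always": "skip-always.lua",  # skip every opening/outro
}


def mpv_skip_flag(preference: str, script_dir: str = "~/.config/ani-skip") -> str:
    """Return an mpv --script flag pointing at the chosen Lua script."""
    script = SKIP_SCRIPTS[preference]
    return f"--script={script_dir}/{script}"
```

With "once", the user keeps the ability to rewind into the opening/outro after the first automatic skip, which matches the behavior requested in the quoted comment.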
> Wow, super active git, I'm amazed at how much has been implemented, and how well maintained everything is, not abandoned at all

Thanks for your comment! I’ve actually decided...
It's working without any problems, but why is the generation speed slow compared to non-quantized models?
Thanks for the fast response. Do you plan to work on it someday? I can implement it if you can explain flash attention a little.
> It's available at this branch: https://github.com/Minami-su/attention_sinks_autogptq

@synacktraa Thank you 🙏
@BrianPugh I've been working on a library for automating LLM function/tool calling, where it parses `functions`/`TypedDict`/`NamedTuple`/`Pydantic model` and can invoke them from their raw value (for typeddict, namedtuple and pydantic...
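The library itself isn't shown in this comment. As a hedged illustration of the general idea (deriving a tool-calling schema from a plain Python function's type hints), here is a minimal sketch; `function_schema` and the type map are hypothetical names, not the library's actual API:

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON-schema type names
PY_TO_JSON = {int: "integer", str: "string", float: "number", bool: "boolean"}


def function_schema(fn) -> dict:
    """Build a JSON-schema-like tool spec from a function's signature."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters go into the schema
    sig = inspect.signature(fn)
    properties = {
        name: {"type": PY_TO_JSON.get(tp, "string")} for name, tp in hints.items()
    }
    required = [
        name
        for name, param in sig.parameters.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "name": fn.__name__,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }
```

An LLM tool-calling loop would send this schema to the model, then invoke the original function with the arguments the model returns; supporting `TypedDict`, `NamedTuple`, and Pydantic models (as the comment describes) would mean inspecting their fields the same way.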