Patrick Horn
@John-Nagle Regarding skeletons, we are further along thanks to work done in the past few years by other communities, V-Tubers and the recent push for social VR tech. A lot...
Hey, first, I appreciate the interest in this project, and it's pretty cool that you have been following the work since the new design. The big deal with this rewrite...
In case it helps, we attempted to isolate the point at which `is_cancelled_` is set to true. When hitting Ctrl-C while streaming, evhtp seems to hit the `htp__connection_eventcb_` callback with event...
I just made PR #2244; feel free to test it and give feedback. I only tested on Apple Silicon (M3), but the original code from #1028 was designed for...
Please do not reply with only a basic "I'm interested" or "please" without contributing any new information. To show support, react to the main post with a thumbs up 👍...
@aikitoria I found that it was far easier and more performant to implement in decodingCommon.cu, since the same math used for logprobs can be used for calculating the relative threshold used...
Here is some example BLS code for adding the min_p value into the bad_words list in the way this PR expects: ```py numpy_tensor = preproc_output_tensor.as_numpy() if trtllm_tensor_name == "bad_words_list": bad_words_data,...
It looks like `scaled_mm_blockwise_sm100_fp8.cu` and `scaled_mm_blockwise_sm100_fp8_dispatch.cuh` were committed only as symlinks. When the change is done and ready for review, make sure to double-check that the actual files are added....