leiwen83
@hwu36 Hi, could you help review this patch and let me know whether the single-stage patch can be merged? Thanks, Lei
> @leiwen83 can you provide performance results to justify this PR? It seems to me there is no performance gain from adding this single-stage conv. So maybe we could...
V1.0.4 also has the problem, but it only hits the error after several op tests.

```
star:/data/cnn # ./mace_cc_benchmark
Benchmark                                     Time(ns)  Iterations  Input(MB/s)  GMACPS
--------------------------------------------------------------------------------------------------------------
MACE_BM_ADDN_2_1_128_128_32_float_CPU          1321442         720      3174.04    0.00
MACE_BM_ADDN_2_1_256_256_32_float_CPU          5026605...
```
> @leiwen83 Could you use the master branch? We now work on master, thank you.

First, the master branch cannot even run. The first comment was tested on the master branch...
Hi @daadaada, I see that Triton currently uses inline PTX assembly for codegen. Since your previous work on gas and turingas proves that the SASS level is more effective,...
Could we switch to another repo like https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B? meta-llama is a gated repo, so I think many people may not have permission to access it, including me...
> Why was it not covered by existing tests?

It is because the current ngram path still uses the draft model (set as the target model) to get some info like the vocab size. In...
cc @cadedaniel
> will take a look Monday. btw, how is this different from the system efficiency metric? (boost ratio == (num_spec_tokens+1) * system efficiency?)

The new boost_ratio gives a more accurate expression...
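As a minimal sketch of the relationship asked about in the quote (not the actual vLLM metric code), if `system_efficiency` is the fraction of the per-step token slots that end up accepted, the proposed identity is just a rescaling by the number of slots, i.e. the `num_spec_tokens` speculative slots plus one bonus token:

```python
# Hypothetical illustration of the quoted identity:
#   boost_ratio == (num_spec_tokens + 1) * system_efficiency
# Names and the exact definition of system_efficiency are assumptions here,
# not taken from the vLLM source.
def boost_ratio(num_spec_tokens: int, system_efficiency: float) -> float:
    # Each scoring step has num_spec_tokens speculative slots plus 1 bonus slot;
    # multiplying by the acceptance fraction yields tokens emitted per step.
    return (num_spec_tokens + 1) * system_efficiency

# With 4 speculative tokens and 60% efficiency, 5 slots * 0.6 -> 3.0 tokens/step.
print(boost_ratio(4, 0.6))
```

Under this reading, the two metrics would carry the same information up to the constant factor `(num_spec_tokens + 1)`, which is presumably what the question is probing.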
@cadedaniel @robertgshaw2-neuralmagic Any comments on the latest PR changes? :)