Daniel Stokes comments

Results 15 comments of Daniel Stokes

Enabled BLIF buffer elimination in preparation for InOuts

@kmurray This changes Odin II's BLIF output slightly, which seems to have caused some QoR changes for VPR. The issues on Travis appear to be improvements. What do I need...

Enabled BLIF buffer elimination in preparation for InOuts

Running the regression tests (with -create_golden to regenerate golden) for basic, strong and nightly I get this error: ``` vtr_func_formal: k6_frac_N10_40nm.xml/stereovision3.v/common file : abc.lec.out failed: Couldn't determine Logical Equivalence status...

Mixtral engine build gives CUDA OOM on 8 40GB GPUs (0.8.0 release)

@vnkc1 Can you try switching the device map from auto to cpu, as suggested here? https://github.com/NVIDIA/TensorRT-LLM/issues/1440

Mixtral engine build gives CUDA OOM on 8 40GB GPUs (0.8.0 release)

I'm not sure I understand how that prevents you loading on the CPU? If you are quantizing to FP8 (Hopper only) you should be using quantize.py. If you are quantizing...

Supporting aggregator elements

Sure thing, I opened PR #120

thrust::uniform_int_distribution<uint64_t> exclusively produces multiples of 4096

>For the record, the code above produces 16 bits/4 zeros (not 12 bits, 3 zeros) at the end because of the 48-bit engine (ranlux48). Even if the bug wasn't there the 4th...

[FEA]: Add a new shuffle iterator to thrust

I am happy to implement this myself, but I would appreciate some design guidance on what the most suitable API would be. I couldn't find anything already in thrust with...

[FEA]: Add a new shuffle iterator to thrust

Hi @miscco, Actually this function does have a constant cost, each element is entirely independent and can be computed in any order. The RNG is only invoked once on construction...

[FEA]: Add a new shuffle iterator to thrust

I haven't done the formal analysis, but in the worst case one call to `iterate_until_in_range` can take O(n) time. However, for each iteration it can be roughly modelled as a (worst...

[FEA]: Add a new shuffle iterator to thrust

To elaborate on how this works, `feistel_bijection` works on a power-of-two range. To generalize this to a non-power-of-two range, we round up to a power of two...
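
The idea described in the shuffle iterator comments can be sketched roughly as follows. This is a minimal illustration and not thrust's implementation: the `FeistelShuffle` type, its `mix` round function, the key schedule, the four-round count, and the 2^32 size limit are all hypothetical choices made for the example. It assumes a balanced Feistel network, so the range is rounded up to a power of two with an even number of bits, and it re-applies the bijection until the result lands back inside `[0, n)`, which is the role the comments attribute to `iterate_until_in_range`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: a stateless permutation of [0, n) built from a
// Feistel network on a power-of-two range plus "cycle walking".
struct FeistelShuffle {
    std::uint64_t n;          // size of the range being shuffled
    std::uint32_t half_bits;  // bits in each Feistel half
    std::uint32_t half_mask;  // mask selecting one half
    std::uint32_t keys[4];    // one key per round (round count is arbitrary here)

    // Simple integer mixer used as the round function (an assumption; any
    // deterministic function works, since Feistel rounds are always invertible).
    static std::uint32_t mix(std::uint32_t v) {
        v ^= v >> 16; v *= 0x7feb352dU;
        v ^= v >> 15; v *= 0x846ca68bU;
        v ^= v >> 16;
        return v;
    }

    FeistelShuffle(std::uint64_t size, std::uint32_t seed) : n(size) {
        assert(n > 0 && n <= (std::uint64_t{1} << 32));
        // Round up to a power of two with an even bit count: 2^(2*half_bits) >= n.
        half_bits = 1;
        while ((std::uint64_t{1} << (2 * half_bits)) < n) ++half_bits;
        half_mask = (std::uint32_t{1} << half_bits) - 1;
        // The seed is consumed only here, once, at construction.
        for (std::uint32_t r = 0; r < 4; ++r) keys[r] = mix(seed + 0x9E3779B9U * (r + 1));
    }

    // Bijection on [0, 2^(2*half_bits)).
    std::uint64_t permute_pow2(std::uint64_t x) const {
        std::uint32_t left = static_cast<std::uint32_t>(x >> half_bits) & half_mask;
        std::uint32_t right = static_cast<std::uint32_t>(x) & half_mask;
        for (std::uint32_t key : keys) {
            std::uint32_t next = left ^ (mix(right ^ key) & half_mask);
            left = right;
            right = next;
        }
        return (static_cast<std::uint64_t>(left) << half_bits) | right;
    }

    // Maps index i in [0, n) to its shuffled position in [0, n).
    std::uint64_t operator()(std::uint64_t i) const {
        assert(i < n);
        std::uint64_t x = permute_pow2(i);
        // Cycle walking: keep applying the bijection until we re-enter [0, n).
        // This terminates because the cycle containing i returns to i itself.
        while (x >= n) x = permute_pow2(x);
        return x;
    }
};
```

Each index is mapped independently of the others, so the elements can be evaluated in any order. The loop in `operator()` is the part whose worst case depends on how much the range was rounded up.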