Kevin Stephano

Results: 30 comments by Kevin Stephano

The first issue is that there isn't any fusion. The `add` and the `index_select` are executed as separate kernels, so there isn't any difference from eager mode and, given the segmentation,...

I don't think this is an nvFuser issue. The nvFuser standalone repro does not fail. I wonder if it was just the place that the CUDA error first got caught....

I am not sure what exactly is desired, but we could add a `kwarg` such as `make_repro` to nvFuser's `FusionDefinition::execute()` method, for example `FusionDefinition.execute(inputs, make_repro=True)`. The issue is that we...

This sounds good! This might already be in mind with these bullet points but it would be helpful to show explicitly how to:
* Show the trace
* Execute specific...

The optimizer's in-place updates might be a concern when designing a scheme, as Adam already carries 3 copies of model state algorithmically.
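To make the "3 copies" point concrete, here is a minimal sketch that counts the persistent memory Adam holds: the parameters plus the first- and second-moment buffers. The helper name and sizes are hypothetical, and it assumes the moment buffers use the same dtype as the parameters.

```python
def adam_state_bytes(num_params: int, bytes_per_element: int = 4) -> dict:
    """Estimate persistent memory for a model trained with Adam.

    Hypothetical helper, not part of any library. Assumes the moment
    buffers have the same shape and dtype as the parameters.
    """
    params = num_params * bytes_per_element
    exp_avg = params      # first-moment estimate, one value per parameter
    exp_avg_sq = params   # second-moment estimate, one value per parameter
    return {
        "params": params,
        "exp_avg": exp_avg,
        "exp_avg_sq": exp_avg_sq,
        "total": params + exp_avg + exp_avg_sq,
    }
```

Any scheme that replaces the in-place updates with out-of-place ones would likely add temporaries on top of this total.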

> triage review
>
> * filed [nvfuser failure #514](https://github.com/Lightning-AI/lightning-thunder/issues/514) for the nvFuser part of this issue (let's track that separately)
> * @kiya00 would you take a look...

This is potentially a little more weird. The error actually suggests that the reshape `shape` is composed of something other than Python integers or nvFuser Scalars, suggesting that the...
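A Python-side check along these lines could help narrow down which entry of `shape` is at fault before the call reaches nvFuser. This is a hypothetical helper, not part of nvFuser: `allowed_types` defaults to `int` only, and the caller would have to pass in whatever Scalar type applies.

```python
def find_bad_shape_entries(shape, allowed_types=(int,)):
    """Return (index, type name) for every entry whose type is not allowed.

    Hypothetical debugging helper; `allowed_types` is an assumption and
    should be extended by the caller with any accepted scalar types.
    """
    bad = []
    for i, dim in enumerate(shape):
        if not isinstance(dim, allowed_types):
            bad.append((i, type(dim).__name__))
    return bad
```

An empty result means every entry has an allowed type; otherwise the returned pairs name the offending positions and their types.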

I need to add something like the following to report the type in pybind11:
```C++
void check_type(py::handle obj) {
  py::handle type = py::type::handle_of(obj);
  if (!type.is_none()) {
    std::string type_name = static_cast<std::string>(py::str(type));
    ...
```

Can we measure whether this is a wall-clock time (CPU overhead) regression or a kernel time regression? An Nsight profile would show the comparison.
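Once both numbers are available, the split can be done with simple arithmetic. The sketch below is a hypothetical helper that takes four measurements you supply (for example, read off a profiler timeline) and attributes the wall-clock delta to kernel time or to everything else. It assumes host work and kernel execution do not overlap, which is only a rough approximation.

```python
def split_regression(base_wall_ms, base_kernel_ms, new_wall_ms, new_kernel_ms):
    """Split a wall-clock regression into kernel and non-kernel (host) parts.

    Hypothetical helper. All inputs are caller-supplied measurements in
    milliseconds; assumes host and kernel time do not overlap.
    """
    wall_delta = new_wall_ms - base_wall_ms
    kernel_delta = new_kernel_ms - base_kernel_ms
    host_delta = wall_delta - kernel_delta  # whatever the kernels do not explain
    return {
        "wall_delta_ms": wall_delta,
        "kernel_delta_ms": kernel_delta,
        "host_delta_ms": host_delta,
    }
```

If `kernel_delta_ms` is near zero while `wall_delta_ms` is not, the regression is most likely CPU overhead rather than slower kernels.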

This isn't a question of nvFuser increasing compilation time; it is a question of what is proper to do for aliasing. nvFuser would not be increasing compilation time due to segmentation if...