Ekaterina Aidova
@yeonbok thank you for your PR, but I'm not sure this is the right place to do that: 1. the device can be changed at runtime with the model.to() method, while you...
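The concern above can be illustrated with a minimal sketch (the class and names below are hypothetical, loosely modeled on the optimum-intel style, not the actual implementation): a device-dependent value computed once at construction time goes stale after a later `model.to()` call, so such decisions should be deferred to use time.

```python
class TinyOVModel:
    """Hypothetical sketch of a model whose device can change at runtime."""

    def __init__(self, device="CPU"):
        self._device = device
        # Problematic: a device-specific setting decided once at init time
        # is never updated if the device changes later via .to()
        self.cached_hint = self._pick_hint(device)

    @staticmethod
    def _pick_hint(device):
        return "LATENCY" if device == "CPU" else "THROUGHPUT"

    def to(self, device):
        self._device = device
        return self  # note: cached_hint is NOT recomputed here

    @property
    def current_hint(self):
        # Safer: derive device-dependent behavior at use time
        return self._pick_hint(self._device)


m = TinyOVModel("CPU").to("GPU")
print(m.cached_hint)   # stale value chosen at init: LATENCY
print(m.current_hint)  # reflects the current device: THROUGHPUT
```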
> BTW, apart from this change, I'd like to ask a question about the logits (output) precision. Currently it is set as fp32, so if the context size is very...
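The logits-precision question above can be made concrete with a back-of-the-envelope calculation; the vocabulary size and context length below are illustrative numbers, not figures from the discussion.

```python
# Rough memory estimate for the full logits tensor of one forward pass
# over the whole context: context_len x vocab_size elements.
vocab_size = 152_064   # illustrative: a large modern tokenizer vocabulary
context_len = 32_768   # illustrative: a long-context prompt

fp32_bytes = vocab_size * context_len * 4  # 4 bytes per fp32 element
fp16_bytes = vocab_size * context_len * 2  # 2 bytes per fp16 element

print(f"fp32 logits: {fp32_bytes / 1e9:.1f} GB")  # ~19.9 GB
print(f"fp16 logits: {fp16_bytes / 1e9:.1f} GB")  # ~10.0 GB
```

At these sizes, halving the output precision saves on the order of 10 GB for a single long-context pass, which is why the choice of logits precision matters once the context grows.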
> There are some entities like InferRequestWrapper which are not publicly exposed. I've tried to keep the imports for those as-is for now. We use it in some notebooks and...
> Introduce a CalibrationDataset class representing Dict[str, nncf.Dataset]. I think it makes sense to register a dummy object for it in https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/utils/dummy_openvino_and_nncf_objects.py and in the import structure here https://github.com/huggingface/optimum-intel/blob/main/optimum/intel/__init__.py#L76-L98; it is needed for...
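The dummy-object registration mentioned above follows a pattern common in HF-style libraries: when an optional backend (here, nncf) is missing, the class is replaced by a placeholder that raises a helpful ImportError on first use instead of failing at import time. A simplified sketch of that pattern, not the exact optimum-intel code:

```python
class DummyObject(type):
    """Metaclass for placeholder classes that stand in for real ones
    when their optional backend is not installed. Instantiating the
    placeholder raises a descriptive ImportError instead of crashing
    the whole package import."""

    def __call__(cls, *args, **kwargs):
        raise ImportError(
            f"{cls.__name__} requires the following backend(s): "
            f"{cls._backends}. Please install them to use this class."
        )


# Placeholder registered only when nncf is unavailable
class CalibrationDataset(metaclass=DummyObject):
    _backends = ["nncf"]


try:
    CalibrationDataset()
except ImportError as e:
    print("raised:", e)
```

This keeps `from optimum.intel import CalibrationDataset` working in environments without nncf, deferring the failure to the point where the class is actually used.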
@amit1rrr I think your suggestion may be too noisy... if an error happens, there is a stack trace with the exact location, including the cell content. The problematic part for debugging is the kernel...
> > The problematic part for debugging is kernel died (that raised outside notebook and does not have any logging in some cases)
>
> Can you please share an...
@p-wysocki please make sure that the updated IR works on all devices, including NPU, and that the downgrade transformation is applied where an op is not implemented. We still see an issue for maxpool (during 2...
@Oneul-hyeon currently optimum-intel does not support sLM inference on NPU, but there is another solution that allows it and works on the same optimum-intel converted models, please check...
> The generation parameters included in the model's config will be moved to the generation_config in #902 which should fix this issue

@echarlaix thanks for confirming and for the better solution! Could...
@peterchen-intel no, I do not see any risks there; the reason for adding the clone does not apply to us, but it makes the perf incomparable with original optimum and llm bench...