YLSnowy
> As far as I remember, we have a different checkpoint structure from the original Mixtral model. For instance, [we keep every expert in a separate file](https://github.com/dvmazur/mixtral-offloading/blob/ce545188b804238f0b23a59fc45e6a8f8b390c40/src/build_model.py#L148). This should lead...
Thank you for your answer. So how do I profile a model? I haven't seen any source code for it.