Peter
The failed CI seems unrelated.
@mcabbott Could you help review this PR?
> Any idea if Mamba2 model can be supported by Transformers.jl or not?

Depends on whether we have the required operators.

> can I ask you about your policy when...
`/test/huggingface/load.jl:34` tests whether there are parameters that exist in the model but are not found in the `state_dict` and are therefore randomly initialized. You can set `ENV["JULIA_DEBUG"] = Transformers` before calling `load_model` in the...
@CarloLucibello Personally, I think the device functionality changes should be considered breaking. See also https://github.com/FluxML/Flux.jl/issues/2513
For the FluxAdaptor types, yes, but the whole device functionality in Flux would affect many aspects. Moreover, MLDataDevices.jl is a different implementation that might introduce inconsistency, but that’s...