MagicSource
That's very good. How does the quality compare with Cambricon?
When will the Chinese weights be released?
These two guys looked but didn't respond.
Thanks. Since more and more MLLMs use SigLIP as the vision encoder, FlashAttention support would greatly reduce training time.
@glenn-jocher Why doesn't the speed change at all after pruning? Does it only zero out the conv weights without actually changing the structure? How to save the pruned model...
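A minimal sketch with `torch.nn.utils.prune` illustrating why speed is unchanged: unstructured pruning only attaches a binary mask, so the weight tensor keeps its full shape and the zeroed entries are still multiplied at inference time.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(3, 16, 3)
prune.l1_unstructured(conv, name="weight", amount=0.5)

# The tensor shape is unchanged; pruning is a mask, not structural removal.
print(conv.weight.shape)             # torch.Size([16, 3, 3, 3])
print(hasattr(conv, "weight_mask"))  # True -- mask buffer added by prune
print(float((conv.weight == 0).float().mean()))  # roughly 0.5 sparsity
```

To actually get a speedup you would need structured (channel-level) pruning plus rebuilding the layer with fewer channels.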
@glenn-jocher Looks like prune has a `remove` method which can make the pruning permanent: ``` prune.remove(module, 'weight') ``` and all weights and params are saved in module.state_dict, which can be used for new...
@glenn-jocher Nice. Did you figure out how to obtain the pruned model architecture?
@glenn-jocher So the simplified model cannot infer its new channel numbers and shapes automatically. Is there any way to make that happen?