jbm

Results 52 comments of jbm

After 50 epochs I'm getting the same thing—roughly 8 seconds of music, then some kind of stasis or short loop: https://github.com/facebookresearch/audiocraft/assets/15166432/f54cf050-a1c8-449f-8fe6-170e10471c7a https://github.com/facebookresearch/audiocraft/assets/15166432/33ec1c1b-00a4-405f-9b35-4d25c907e893 https://github.com/facebookresearch/audiocraft/assets/15166432/c032f308-7fe2-468d-8e9f-9b12a02f6795 This is from my solver (which is...

Yeah, 30 seconds—so the 30 / 4 is my 8-ish (7.5) seconds of good output explained. But I think it's actually that I was inadvertently training from scratch, rather than...

I'm also not clear on what is being indicated with the strings "unprompted_description" and "prompted_description" in the sample generations. Is that explained anywhere? (I don't see anything clear in the...

I hit this problem and could only solve it by disabling GPUs... `os.environ["CUDA_VISIBLE_DEVICES"] = ""`
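A minimal sketch of that workaround, assuming a Python script using a CUDA-aware framework (the comment doesn't say which one). The variable generally has to be set before the framework initializes CUDA, so the safest place is the very top of the script, ahead of the framework import:

```python
import os

# Hide all GPUs from CUDA-aware libraries. An empty string means
# "no visible devices", so the framework falls back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Assumption: the framework (e.g. torch or tensorflow) is imported
# only after this point, so it sees the variable when it starts up.
hidden_devices = os.environ["CUDA_VISIBLE_DEVICES"]
```

Setting the variable after the framework has already started CUDA usually has no effect.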

@aarmstrong78 Did you ever solve this problem? I have the same error message (`Cannot squeeze a dimension whose value is not 1`) from BERT running in CoreML (via ONNX).

Just as an update, the model I was using above was based on the pytorch-pretrained-BERT repo, so I tried the Hugging Face DistilBERT model, but I get exactly the same error....

Another update on my `Cannot squeeze` error; I'm training BertForMaskedLM, which, looking at the code, shouldn't have any `squeeze` calls in it—BertForQuestionAnswering, however, does call `squeeze`. So I'm guessing that,...
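For context on what the message means: a squeeze op can only drop an axis of size 1. The sketch below illustrates that rule in plain Python. `squeeze_shape` is a hypothetical helper and the shapes are made up for illustration; this is not CoreML's or ONNX's actual implementation.

```python
def squeeze_shape(shape, axis):
    """Hypothetical helper: return `shape` with `axis` removed, as a squeeze op would."""
    axis = axis % len(shape)  # allow negative axes such as -1
    if shape[axis] != 1:
        raise ValueError(
            f"Cannot squeeze a dimension whose value is not 1 "
            f"(axis {axis} has size {shape[axis]})"
        )
    return shape[:axis] + shape[axis + 1:]


# Illustrative shapes only: (batch, sequence, hidden)
ok = squeeze_shape((1, 256, 768), 0)  # batch axis has size 1, so it can be dropped

try:
    squeeze_shape((1, 256, 768), 1)  # sequence axis has size 256
    failed = False
except ValueError:
    failed = True
```

So the error suggests that some squeeze in the converted graph is being applied to an axis whose size is not 1 at runtime, whether or not the original model code calls `squeeze` explicitly.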

@hollance That would be extremely helpful, thanks! [bert-test-256_FP16.mlmodel.zip](https://github.com/huggingface/swift-coreml-transformers/files/4263226/bert-test-256_FP16.mlmodel.zip) The error from CoreML (in Xcode) is: ``` 2020-02-27 10:05:54.749888-0800 Spliqs[2968:1145298] [espresso] [Espresso::handle_ex_plan] exception=Espresso exception: "Invalid state": Cannot squeeze a dimension...

Thanks so much, I'll give this a try! Do you understand why the squeeze was there in the first place? Searching `modeling_distilbert.py` only reveals an explicit `.squeeze` call in BertForQuestionAnswering...

Wow, okay. It really is the Wild West! It would be great if Apple would throw a little more of its multi-billion dollar steam behind this process. It's generally pretty...