Pedro Cuenca
> Amazing work huggingface team ❤️!
>
> Here are mine -
>
> ### 14" MacBook M1 Pro - 14 GPU cores / 6 performance cores
>
> All settings...
Nice computer @Tz-H! We were very interested to see performance on M2 Max, thanks a lot!
Hi @grapefroot! Initially I was under the impression that RAM would be an important factor for performance (it is on iOS), but in our tests we did not notice any...
@Zabriskije the results in our table were obtained as follows:
- `ORIGINAL` attention when using the `CPU_AND_GPU` compute units.
- `SPLIT_EINSUM` attention for `CPU_AND_ANE`.
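To make the pairing concrete, here's a minimal sketch of how a compute-unit choice is applied when loading a converted Core ML model with coremltools (the `Unet.mlpackage` path is a placeholder; note that coremltools spells the Neural Engine option `CPU_AND_NE`):

```py
import coremltools as ct

# The attention implementation (ORIGINAL vs SPLIT_EINSUM) is baked in
# at conversion time, so load the model variant matching your target.
model = ct.models.MLModel(
    "Unet.mlpackage",  # placeholder path to a converted UNet
    compute_units=ct.ComputeUnit.CPU_AND_NE,  # CPU + Neural Engine
)
```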
@mja – Super interesting, thanks a lot!
@Zabriskije We wanted the blog post to be easy to follow, so we decided to leave out some details. But yeah, maybe it's worth pointing out :) Barring bugs, the way the...
Data point on an Intel Mac: iMac Retina 5K, 2020
Processor: 3.6 GHz 10-Core Intel Core i9
GPU: AMD Radeon Pro 5700 XT 16 GB
Model: stable-diffusion-2-base
Guidance Scale: 7.5...
Thanks for the comments @sacmehta! Hub repositories do not support arbitrary hierarchy. Similar to GitHub, they are structured as a namespace (`corenet-community` in this case), and then a flat list...
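As a quick illustration of that flat structure, here's a sketch using the `huggingface_hub` client (the `corenet-community` namespace comes from this thread; everything else is a standard API call) to enumerate the repos under a namespace:

```py
from huggingface_hub import list_models

# The Hub has no nested folders: every repo lives directly under its
# namespace as `namespace/repo_name`.
for model in list_models(author="corenet-community"):
    print(model.id)  # e.g. "corenet-community/<repo_name>"
```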
Hi @Arian-Akbari! Probably related to the fact that llama3 uses two stop ids in conversational mode: https://github.com/meta-llama/llama3/blob/0cee08ec68f4cfc0c89fe4a9366d82679aaa2a66/llama/tokenizer.py#L91-L94 Your inference software should stop generation once either of them is encountered :)
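For example, with 🤗 Transformers you can pass both stop ids to `generate` (a sketch; the checkpoint name and prompt are assumptions for illustration):

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Llama 3 chat can end a turn with either <|end_of_text|> (the regular
# eos token) or <|eot_id|>, so both ids are passed as stop conditions.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]
outputs = model.generate(input_ids, max_new_tokens=128, eos_token_id=terminators)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```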
Hello! The example shows a regular generation prompt; you can add a negative prompt by passing `negative_prompt` as an additional argument in the call to `pipe()`:
```py
image = pipe("Green pokemon...", negative_prompt="...").images[0]
```
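For completeness, a self-contained sketch (the checkpoint id and both prompt strings are illustrative assumptions, not values from the thread):

```py
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint; any Stable Diffusion pipeline accepts
# `negative_prompt` the same way.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "Green pokemon, detailed digital art",  # hypothetical prompt
    negative_prompt="low quality, blurry",  # concepts to steer away from
).images[0]
image.save("pokemon.png")
```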