Jonathan

32 comments by Jonathan

I think that's a good evolution. We can refactor the doc at some point if needed! I want to split up what you've written into a non-technical (about the project,...

Some forensic analysis follows. This issue happens in our call to `decode_latents()` but is actually an issue with the `forward()` method of `AttentionBlock` in diffusers' `attention.py`:

```
attention_scores = torch.baddbmm(
    torch.empty(
        query_proj.shape[0], query_proj.shape[1],...
```
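For scale, a back-of-the-envelope sketch (my numbers, not from the issue; assuming a single-head `AttentionBlock` over a 128×128 latent, i.e. a 1024×1024 output image) of the score matrix that this `baddbmm` materializes:

```
# Rough memory estimate for the attention-score tensor allocated by
# torch.empty(...) + baddbmm in AttentionBlock.forward().
# Assumptions (mine): single head, 128x128 latent, fp32 scores.
h = w = 128                         # latent spatial dims (1024px image / 8)
tokens = h * w                      # 16384 tokens after flattening
score_bytes = tokens * tokens * 4   # (tokens, tokens) matrix in fp32
print(f"attention scores: {score_bytes / 2**30:.1f} GiB")  # ~1.0 GiB per batch item
```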

The problems with tiled VAE as I see them are:
- Inconsistency between tiles (if we didn't care about that, we could just use embiggen)
- Requirement of xformers (and...
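For context, a minimal usage sketch of tiled VAE decoding, assuming a diffusers version where `AutoencoderKL.enable_tiling()` is available (the model id is just an example); the seams between tiles are the inconsistency mentioned above:

```
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.vae.enable_tiling()  # decode in tiles: lower peak VRAM, but tiles can disagree
image = pipe("a photo of an astronaut", height=1024, width=1024).images[0]
```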

I tested and still ran out of VRAM on `softmax`, even after separating this out into a `bmm` and a multiply. Everything gets converted to fp32.
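For reference, a minimal sketch of what "separating this out" means here (not the exact patch I tested; shapes assumed to be `(batch, tokens, channels)`):

```
import torch

def attention_scores_split(query_proj, key_proj, scale):
    # Equivalent to torch.baddbmm(empty, q, k.transpose(-1, -2), beta=0, alpha=scale),
    # but done as two steps: a bmm and an elementwise multiply.
    scores = torch.bmm(query_proj, key_proj.transpose(-1, -2))  # (B, T, T)
    scores = scores * scale
    # softmax still materializes another (B, T, T) tensor, upcast to fp32,
    # which is where the VRAM ran out in testing.
    return torch.softmax(scores.float(), dim=-1)
```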

> looks good and i won't make this a blocker but i was confused/surprised that the term `h_symmetry_point` was being used to describe a point in _time_ rather than an...

I'm all in favor of increasing the quality of commit messages, but not a huge fan of enforcing a particular style of commit message through a pre-commit hook. I do like...

I believe this is the same issue as #2672. After image generation is complete, you need a lot of memory for a brief period to decode the underlying representation into...
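A minimal sketch of the decode step in question (standard diffusers-style code, not necessarily this project's exact implementation; `0.18215` is the SD 1.x latent scaling factor):

```
import torch

@torch.no_grad()
def decode_latents(vae, latents, scaling_factor=0.18215):
    latents = latents / scaling_factor    # undo SD latent scaling
    image = vae.decode(latents).sample    # (B, 3, H, W), values in [-1, 1]
    return (image / 2 + 0.5).clamp(0, 1)  # normalize to [0, 1]
```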

Putting some thoughts and testing results here in this PR. In some brief testing, you get that performance boost but also non-deterministic behavior. None of the options available (subject to...
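To make the trade-off concrete (my guess at the option under discussion, based on the cuDNN follow-ups below):

```
import torch

# cuDNN autotuning: benchmarks candidate convolution algorithms per input
# shape and picks the fastest, which may be a non-deterministic one.
torch.backends.cudnn.benchmark = True
```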

> Do we need to micromanage each possible implementation like that, or is it sufficient to use `torch.backends.cudnn.deterministic = True`? https://pytorch.org/docs/master/notes/randomness.html#cuda-convolution-determinism

Maybe we can get away with that or `torch.use_deterministic_algorithms(True)`....

Here's what I get:

```
RuntimeError: Deterministic behavior was enabled with either `torch.use_deterministic_algorithms(True)` or `at::Context::setDeterministicAlgorithms(true)`, but this operation is not deterministic because it uses CuBLAS and you have CUDA >=...
```
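For completeness, a minimal sketch of the workaround this error points to, per PyTorch's determinism notes (the workspace variable must be set before CUDA initializes; `:16:8` is the lower-memory alternative):

```
import os
# CuBLAS needs a fixed workspace size for deterministic behavior on
# CUDA >= 10.2; set it before CUDA is initialized.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch
torch.use_deterministic_algorithms(True)
torch.backends.cudnn.benchmark = False  # avoid non-deterministic autotuned algos
```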