Felipe Sens Bonetto

10 comments by Felipe Sens Bonetto

@medvedev1088, @ayush3298, could you take a look, please? =)

@medvedev1088 could you take a look?

Here is how I implemented an easy fallback strategy:
```
geocoded_by :full_address, lookup: lambda { |obj| obj.geocoder_lookup }

def geocoder_lookup
  if Geocoder.search(full_address, lookup: :nominatim).present?
    :nominatim
  else
    :fallback_geocoder # placeholder for :google, :esri, etc.
  end...
```
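
The same idea can be sketched without the gem: try each provider in order and use the first one that returns results. This is only a minimal sketch in plain Ruby. `first_working_lookup` and the stub providers are hypothetical names made up for illustration, not part of the Geocoder API.

```ruby
# Hypothetical helper: returns the name of the first provider whose search
# yields a non-empty result, or nil if every provider comes back empty.
# `providers` maps a lookup name to a callable that takes the address.
def first_working_lookup(address, providers)
  providers.each do |name, search|
    results = begin
      search.call(address)
    rescue StandardError
      [] # treat a failing provider as "no results" and move on
    end
    return name unless results.nil? || results.empty?
  end
  nil
end

# Stub providers standing in for real geocoding services (assumptions).
STUB_PROVIDERS = {
  nominatim: ->(_address) { [] },                 # finds nothing
  google:    ->(address) { [{ query: address }] } # finds one match
}.freeze

puts first_working_lookup("Av. Paulista, 1000", STUB_PROVIDERS) # prints "google"
```

Because a Ruby hash iterates in insertion order, the order of the keys is the fallback priority. A method like this could be called from the `lookup:` lambda, at the cost of one extra search per provider tried.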

Incorporating elements similar to Sora into this architecture should be feasible: https://openai.com/research/video-generation-models-as-world-simulators. This would involve adding the time dimension to the patches. But incorporating text prompts would probably also be...

Ok, I've figured out that I need to make the layouts of sQ and sK static for now if I want to use gemm(sQ, sK, ...).

It is! Not at the performance I'd like, but it is definitely compiling. Thanks for the CUTLASS class, @ericauld! Any tips on how to speed up this...

It's still far from cuBLAS. I'm working on getting the proper thread partitions before moving on to the other parts needed for flash attention.
```
attention_query_key_kernel2 | Best time 54.376431 ms...
```

This is at 65% of the speed of cuBLAS `cublasSgemmStridedBatched`.

@ngc92 This is because of the variable types we are using, right? Or do we need to turn on a flag explicitly? My plan today is to make the shapes...

What is the speed of matmul_tri compared with cuBLAS?