Ondřej Čertík

Results 365 issues of Ondřej Čertík

On my machine, these changes speed up inference from 0.789s to 0.602s.

This gets to 0.594s, but it's not as readable as before, so I am going to keep it as a Draft for now, since the ideas are good, but ultimately...

And change the instructions / workflow to simply download it. That way we eliminate the need to use Python at all, and things become more robust. One would only use...

* Add a version into `model.dat`
* Increment the version with every change in the format (in both the Python writer and Fortran reader)
* In the reader, check the...
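The versioning idea above can be sketched roughly as follows. This is only an illustration, not the actual `model.dat` layout: the 4-byte little-endian integer header, the `MODEL_VERSION` constant, and the `write_model` / `read_model` helpers are all hypothetical, and the sketch assumes the reader's check is a simple comparison against the version it was built for. The real reader is Fortran; Python is used here only to show the round trip.

```python
import io
import struct

# Hypothetical format version; bump it on every change to the file layout.
MODEL_VERSION = 1


def write_model(f, payload: bytes, version: int = MODEL_VERSION) -> None:
    """Write a 4-byte little-endian version header followed by the payload."""
    f.write(struct.pack("<i", version))
    f.write(payload)


def read_model(f, expected_version: int = MODEL_VERSION) -> bytes:
    """Read the header, refuse to continue on a version mismatch."""
    header = f.read(4)
    if len(header) != 4:
        raise ValueError("file too short to contain a version header")
    (version,) = struct.unpack("<i", header)
    if version != expected_version:
        raise ValueError(
            f"model file has version {version}, reader expects {expected_version}"
        )
    return f.read()


if __name__ == "__main__":
    buf = io.BytesIO()
    write_model(buf, b"weights go here")
    buf.seek(0)
    print(read_model(buf))
```

Failing early with an explicit message is the point of the header: a stale file produces a clear error instead of silently misread weights.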

Using the 1558M model and the following input: ``` python encode_input.py \ "Alan Turing theorized that computers would one day become very powerful, but even he could not imagine" \...

Investigate what the best way to parallelize is across nodes using MPI or coarrays.

Currently the attention over heads runs in serial: https://github.com/certik/fastGPT/blob/01eb84b015d89a567245da0445c0abb7d53a8500/gpt2.f90#L101
We should try to parallelize it and see if we get any speedups.
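As a rough illustration of why this loop is a candidate for parallelization: each head's output depends only on that head's own query, key, and value slices, so the heads can be computed independently. The sketch below is not the code in `gpt2.f90`; it is a simplified pure-Python attention (no causal mask, hypothetical function names) that runs the heads once serially and once through a thread pool. In CPython the threads will not give a real speedup for this kind of work; the example only shows that the per-head computations are independent and yield identical results either way.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(head):
    """Scaled dot-product attention for one head (no causal mask).

    `head` is a tuple (q, k, v); each is a list of rows of equal width.
    """
    q, k, v = head
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        weights = softmax(scores)
        out.append(
            [sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))]
        )
    return out


def attention_serial(heads):
    """Process the heads one after another."""
    return [attention_head(h) for h in heads]


def attention_parallel(heads, max_workers=None):
    """Process the heads concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(attention_head, heads))


if __name__ == "__main__":
    heads = [
        ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]]),
        ([[0.5, 0.5], [1.0, 1.0]], [[1.0, 1.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]),
    ]
    print(attention_serial(heads) == attention_parallel(heads))
```

In the Fortran code the analogous experiment would presumably use a parallel loop construct over the head index; which construct works best is exactly what would need to be measured.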

I will use this issue to document our progress.
October 22:
* Summary: https://fortran-lang.discourse.group/t/generics-proposal-for-202y-video-call/353/7

@tclune, @rouson, @FortranFan, @mleair, Magne, and I discussed the various paths forward. It seems the available options are:
* templates without concepts (e.g., traditional C++ templates)
* templates with concepts...