Bryan Lozano
Seeing the same issue on Ubuntu with Python 2.7.x. I tried installing both via pip and by cloning from git and running the script.
@FSSRepo First off, thanks for contributing this example. I'm looping you in on this issue so we can discuss it. Do you recall why you picked 1024 for this overhead? Can...
I was gdb'ing last night, and I saw that when building the graph, memory is allocated from the context's memory pool for the output tensor. It happened somewhere under ggml_mul_mat()....
Tangentially, I also wanted to profile the matrix multiplication. I put a loop (1000 iterations) and timers around this line: https://github.com/ggerganov/ggml/blob/bb8d8cff851b2de6fde4904be492d39458837e1a/examples/simple/simple-ctx.cpp#L66 Again, I see the context running out of memory....
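To make the failure mode concrete, here is a toy model of what I think is going on. This is not ggml code. ToyPool, kToyOverhead, toy_build_mul_mat and toy_iterations_until_full are hypothetical names, and the 256-byte per-tensor cost is an assumption picked only for illustration. The sketch just assumes that every graph build takes a fresh result tensor out of a fixed pool that never hands memory back.

```cpp
#include <cstddef>

// Toy stand-in for a fixed-size context pool. NOT ggml's real allocator.
struct ToyPool {
    std::size_t capacity;  // total bytes in the pool
    std::size_t used = 0;  // bytes handed out so far (never returned)

    explicit ToyPool(std::size_t cap) : capacity(cap) {}

    // Bump-allocate n bytes; returns false once the pool cannot fit them.
    bool alloc(std::size_t n) {
        if (n > capacity - used) return false;
        used += n;
        return true;
    }
};

// Assumed per-tensor bookkeeping cost in bytes (hypothetical value).
constexpr std::size_t kToyOverhead = 256;

// Models one graph build: the rows x cols float result tensor, plus its
// bookkeeping, is carved out of the pool.
bool toy_build_mul_mat(ToyPool& pool, std::size_t rows, std::size_t cols) {
    return pool.alloc(rows * cols * sizeof(float) + kToyOverhead);
}

// Repeats the build in a loop and returns how many iterations succeeded
// before the pool ran out (at most max_iters).
int toy_iterations_until_full(ToyPool& pool, std::size_t rows,
                              std::size_t cols, int max_iters) {
    int done = 0;
    while (done < max_iters && toy_build_mul_mat(pool, rows, cols)) {
        ++done;
    }
    return done;
}
```

In this model, a pool sized for one result plus a small fixed slack only survives a handful of iterations, while a pool sized for all 1000 results completes the loop. If the real context behaves like this, a profiling loop would need either a pool sized for every iteration or a fresh context per iteration.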