llama.cpp icon indicating copy to clipboard operation
llama.cpp copied to clipboard

We could use std::unordered_map over std::map

Open Fabio3rs opened this issue 1 year ago • 2 comments

If it is not necessary sorted maps, change std::map to std::unordered_map

std::unordered_map is a hash table so it should be faster than std::map when storing many items.

std::map<id, token> can be a std::vector since the vector index can be equal to the token id

Fabio3rs avatar Mar 19 '23 21:03 Fabio3rs

the token vector should prob be a struct now which also includes the score (see https://github.com/ggerganov/llama.cpp/commit/074bea2eb1f1349a0118239c4152914aecaa1be4)

eiz avatar Mar 20 '23 10:03 eiz

the token vector should prob be a struct now which also includes the score (see 074bea2)

I did a commit merging with the new changes using struct, I am not sure about the names or the organization of the struct

Fabio3rs avatar Mar 20 '23 12:03 Fabio3rs

Are you able to measure the performance gains from these changes? Interested to see how much of an impact they have.

Great work!

z11h avatar Mar 20 '23 22:03 z11h

Are you able to measure the performance gains from these changes? Interested to see how much of an impact they have.

Great work!

Thanks!

Sadly it's hard to do consistent tests, but what I got is: last commit (bd4b46d) result with: ./build/llama -m models/7B/ggml-model-f16.bin --color -p 'tdd is' --seed 0

main: mem per token = 14696644 bytes
main:     load time =  3236.45 ms
main:   sample time =    57.25 ms
main:  predict time = 37547.44 ms / 286.62 ms per token
main:    total time = 41166.49 ms

last commit (bd4b46d) + this pull request: (same command above)

main: mem per token = 14696644 bytes
main:     load time =  3204.60 ms
main:   sample time =    57.64 ms
main:  predict time = 37372.86 ms / 285.29 ms per token
main:    total time = 40956.10 ms

This differences should be more noticeable with larger datasets and more tokens, I am using the model 7B.

Fabio3rs avatar Mar 21 '23 00:03 Fabio3rs

Are you able to measure the performance gains from these changes? Interested to see how much of an impact they have.

there should not really be any. none of the code is particularly hot.

Green-Sky avatar Mar 21 '23 00:03 Green-Sky

Apologies for the conflicts - lets resolve and merge.

Regarding the C++ standard question from the other thread: There are a few reasons and I know there will be people that will disagree and I totally understand. I just think this way it is less incentivising to use overly-complicated constructs. You are correct that in this case we can actually gain some performance with std::string_view and even save unnecessary allocations, but this part of the code is totally negligible compared to the full transformer evaluation. Plus, we can do the suggested change using raw pointers - yes, bit more ugly, but the performance will be there.

Overall, my experience tells me this is the better way - or at least it is better in my views and understandings. And if there ever appears a very good reason to bump the standard - we will do it. But at the moment, there is no good enough reason to do it.

ggerganov avatar Mar 21 '23 17:03 ggerganov

Apologies for the conflicts - lets resolve and merge.

Regarding the C++ standard question from the other thread: There are a few reasons and I know there will be people that will disagree and I totally understand. I just think this way it is less incentivising to use overly-complicated constructs. You are correct that in this case we can actually gain some performance with std::string_view and even save unnecessary allocations, but this part of the code is totally negligible compared to the full transformer evaluation. Plus, we can do the suggested change using raw pointers - yes, bit more ugly, but the performance will be there.

Overall, my experience tells me this is the better way - or at least it is better in my views and understandings. And if there ever appears a very good reason to bump the standard - we will do it. But at the moment, there is no good enough reason to do it.

Thanks!

I think I resolved the conflicts, if there are some problems I am happy to fix.

Fabio3rs avatar Mar 21 '23 17:03 Fabio3rs