grok-1
Optimize Error Handling and Regex Caching in Tensor Loading
This PR introduces two enhancements to the tensor loading process (fixes #220):
- Improved error handling within ThreadPoolExecutor to provide detailed logs for failures during parallel tensor loading.
- Implementation of caching for regex operations in get_load_path_str to reduce computational overhead and improve loading efficiency.
These changes aim to enhance the robustness and performance of tensor loading, particularly in distributed computing environments.
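The two changes can be sketched roughly as follows. The function names `get_load_path_str` and `load_tensor` follow the PR description, but the bodies here are illustrative stand-ins, not the repo's actual implementation; `functools.lru_cache` requires hashable arguments, so rule lists would need to be passed as tuples.

```python
import logging
import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import lru_cache

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


# Illustrative stand-in for get_load_path_str: the result of the
# regex-driven rename/exclude lookup is memoized, so repeated calls
# with the same path string skip the regex work entirely.
@lru_cache(maxsize=1024)
def get_load_path_str(path_str, load_rename_rules=None, load_exclude_rules=None):
    if load_exclude_rules:
        for rule in load_exclude_rules:
            if re.search(rule, path_str):
                return None
    if load_rename_rules:
        for pattern, replacement in load_rename_rules:
            if re.search(pattern, path_str):
                return re.sub(pattern, replacement, path_str)
    return path_str


def load_tensor(path):
    # Placeholder for the real tensor-loading work.
    if "bad" in path:
        raise IOError(f"cannot read {path}")
    return path


def load_all(paths):
    """Load tensors in parallel, logging (not swallowing) failures."""
    results, failures = [], []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(load_tensor, p): p for p in paths}
        for fut in as_completed(futures):
            path = futures[fut]
            try:
                results.append(fut.result())
            except Exception:
                # Log the failing path with a full traceback instead of
                # letting the exception vanish inside the pool.
                logger.exception("failed to load tensor from %s", path)
                failures.append(path)
    return results, failures
```

The key point of the executor change is that `future.result()` re-raises worker exceptions on the main thread, where they can be logged with the offending path attached.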
Hey! I'm happy to see a limit set on the memory usage of the LRU cache. Would it be possible for you to write a test for this change? I think a pytest test would suffice.
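A pytest test for the cache bound might look something like this. The cached helper here is a hypothetical stand-in (the real PR caches inside `get_load_path_str`); the tests only exercise the `functools.lru_cache` behavior: a repeat call registers as a hit, and the cache never grows past `maxsize`.

```python
import re
from functools import lru_cache


# Hypothetical cached regex helper standing in for the PR's change.
@lru_cache(maxsize=128)
def cached_match(pattern, s):
    return bool(re.search(pattern, s))


def test_cache_hit_on_repeated_call():
    cached_match.cache_clear()
    assert cached_match(r"foo\d+", "foo123")
    assert cached_match(r"foo\d+", "foo123")  # identical args: served from cache
    info = cached_match.cache_info()
    assert info.hits == 1 and info.misses == 1


def test_cache_respects_maxsize():
    cached_match.cache_clear()
    for i in range(200):  # more distinct keys than maxsize
        cached_match(r"x", f"x{i}")
    assert cached_match.cache_info().currsize <= 128
```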
Perhaps path_tuple_to_string could also utilize the LRU cache. Just a thought.
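That suggestion would be cheap to try, since tuple arguments are already hashable and `lru_cache` applies directly. The body below is a hypothetical stand-in for the repo's `path_tuple_to_string`, not its actual implementation:

```python
from functools import lru_cache


# Hypothetical stand-in: joins a parameter path tuple into a string.
# If the same tuples recur across loads, memoization makes repeat
# calls a dictionary lookup instead of a string-building pass.
@lru_cache(maxsize=None)
def path_tuple_to_string(path):
    return "/".join(str(part) for part in path)
```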
After delving a bit, I now think 32 MB for the LRU cache is overkill. I don't think more than 250 KB should be necessary.
"delving" hahahaha (for reference: https://x.com/JeremyNguyenPhD/status/1774021645709295840)
Hadn't seen this, and the stat doesn't really correspond with anything related to ChatGPT. I most certainly didn't need to use ChatGPT to come to that conclusion.
I recommend trying both ways and running benchmarks to see which provides the best performance improvement.
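A micro-benchmark along these lines would settle it. The patterns and paths below are made up for illustration; note that `re` already caches compiled patterns internally, so any win from `lru_cache` comes from skipping the match itself, not the compilation.

```python
import re
import timeit
from functools import lru_cache

# Illustrative patterns and paths, not from the repo.
PATTERNS = [r"layer_\d+/attention", r"layer_\d+/mlp", r"embedding"]
PATHS = [f"layer_{i}/attention/w" for i in range(50)]


def match_uncached(path):
    return any(re.search(p, path) for p in PATTERNS)


@lru_cache(maxsize=4096)
def match_cached(path):
    return any(re.search(p, path) for p in PATTERNS)


# Repeatedly match the same 50 paths, as a reload-heavy workload would.
uncached = timeit.timeit(lambda: [match_uncached(p) for p in PATHS], number=200)
cached = timeit.timeit(lambda: [match_cached(p) for p in PATHS], number=200)
print(f"uncached: {uncached:.4f}s  cached: {cached:.4f}s")
```

Running both variants against the real tensor paths would show whether the cache (and its size) is worth it.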