
Optimize Error Handling and Regex Caching in Tensor Loading

Open · Madhav-MKNC opened this pull request 11 months ago • 5 comments

This PR introduces two key enhancements to the tensor loading process (fixes #220):

  • Improved error handling within ThreadPoolExecutor to provide detailed logs for failures during parallel tensor loading.
  • Implementation of caching for regex operations in get_load_path_str to reduce computational overhead and improve loading efficiency.

These changes aim to enhance the robustness and performance of tensor loading, particularly in distributed computing environments.
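
For readers skimming the diff, here is a minimal sketch of the two ideas. The names `load_tensors` and `load_fn`, the worker count, and the exact `get_load_path_str` signature are illustrative rather than the precise ones in checkpoint.py; note also that `lru_cache` requires the rule arguments to be hashable (e.g. tuples instead of lists).

```python
import logging
import re
from concurrent.futures import ThreadPoolExecutor, as_completed
from functools import lru_cache
from typing import Optional

logger = logging.getLogger(__name__)


@lru_cache(maxsize=4096)  # maxsize is illustrative, not the value in the PR
def get_load_path_str(
    init_path_str: str,
    load_rename_rules: tuple = (),
    load_exclude_rules: tuple = (),
) -> Optional[str]:
    """Memoized path renaming: repeated paths skip the regex work entirely."""
    for exclude_pattern in load_exclude_rules:
        if re.search(exclude_pattern, init_path_str):
            return None
    load_path_str = init_path_str
    for search_pattern, replacement in load_rename_rules:
        load_path_str = re.sub(search_pattern, replacement, load_path_str)
    return load_path_str


def load_tensors(tensor_paths, load_fn, num_workers: int = 32):
    """Load tensors in parallel, logging which path failed rather than an opaque pool error."""
    results = {}
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = {pool.submit(load_fn, path): path for path in tensor_paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception:
                # Surface the failing tensor path in the logs before re-raising.
                logger.exception("Failed to load tensor at %s", path)
                raise
    return results
```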

Madhav-MKNC avatar Mar 19 '24 16:03 Madhav-MKNC

Hey! I'm happy to see a limit set on the LRU cache's memory usage. Would it be possible for you to add a test for this change? I think a pytest test would suffice.
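
Roughly the kind of test I have in mind (the `checkpoint` import and the argument layout are assumptions; this also assumes the cache exposes the usual `functools.lru_cache` `cache_clear`/`cache_info` interface):

```python
import checkpoint  # hypothetical import of the module that defines get_load_path_str


def test_get_load_path_str_is_cached():
    checkpoint.get_load_path_str.cache_clear()
    rename_rules = (("old", "new"),)

    first = checkpoint.get_load_path_str("old/tensor/path", rename_rules, ())
    second = checkpoint.get_load_path_str("old/tensor/path", rename_rules, ())

    assert first == second == "new/tensor/path"
    info = checkpoint.get_load_path_str.cache_info()
    assert info.hits >= 1  # the second call should be served from the cache
```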

Aareon avatar Mar 22 '24 23:03 Aareon

Perhaps path_tuple_to_string could utilize the LRU cache as well. Just a thought.
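
Something like this is what I'm imagining; the body below is just a stand-in (the real conversion logic would stay as-is), and it assumes the path argument is already a hashable tuple:

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def path_tuple_to_string(path: tuple) -> str:
    # Stand-in for the existing conversion of JAX tree path entries to a string.
    return "/".join(str(p) for p in path)
```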

Aareon avatar Mar 22 '24 23:03 Aareon

After delving a bit, I now think 32 MB for the LRU cache may be overkill. I don't think more than 250 KB at most should be necessary.

Aareon avatar Mar 27 '24 04:03 Aareon

"delving" hahahaha (for reference: https://x.com/JeremyNguyenPhD/status/1774021645709295840)

RaphaelFakhri avatar Mar 30 '24 20:03 RaphaelFakhri

"delving" hahahaha (for reference: https://x.com/JeremyNguyenPhD/status/1774021645709295840)

Hadn't seen this, and the stat doesn't really correspond with anything related to ChatGPT. I most certainly didn't need to use ChatGPT to come to that conclusion.

I recommend trying both approaches and running benchmarks to see which provides the better performance improvement.
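
A quick way to compare, e.g. with timeit; the paths and rename rules below are made up for illustration, so swap in a realistic sample from an actual checkpoint:

```python
import re
import timeit
from functools import lru_cache

RULES = ((r"transformer/layer_(\d+)", r"decoder/block_\1"),)
PATHS = [f"transformer/layer_{i}/mlp/w" for i in range(64)] * 100


def rename_uncached(path: str) -> str:
    for pattern, repl in RULES:
        path = re.sub(pattern, repl, path)
    return path


@lru_cache(maxsize=None)
def rename_cached(path: str) -> str:
    return rename_uncached(path)


print("uncached:", timeit.timeit(lambda: [rename_uncached(p) for p in PATHS], number=10))
print("cached:  ", timeit.timeit(lambda: [rename_cached(p) for p in PATHS], number=10))
```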

Aareon avatar Apr 05 '24 22:04 Aareon