[CPU] Enable mmap for model loading from cache.

Open nshchego opened this issue 1 year ago • 3 comments

Details:

  • Use mmap for model compilation from cache.
  • ...
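
For readers unfamiliar with the technique, here is a minimal sketch of mapping a cached blob read-only instead of copying it into a heap buffer. It is not the code from this PR: `MappedFile` is a hypothetical name, the sketch assumes a POSIX system (`open`, `fstat`, `mmap`), and a Windows build would need `CreateFileMapping`/`MapViewOfFile` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper around a read-only, private file mapping.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        const int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) {
            throw std::runtime_error("cannot open " + path);
        }
        struct stat st {};
        if (::fstat(fd, &st) != 0) {
            ::close(fd);
            throw std::runtime_error("cannot stat " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);
        if (size_ > 0) {  // mmap rejects zero-length mappings
            void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                ::close(fd);
                throw std::runtime_error("cannot mmap " + path);
            }
            data_ = static_cast<const char*>(p);
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }

    ~MappedFile() {
        if (data_ != nullptr) {
            ::munmap(const_cast<char*>(data_), size_);
        }
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

    // Bounds-checked byte access (checked in debug builds only).
    char at(std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

Pages of the mapping are loaded by the OS on first access, so a consumer that reads weights directly from `data()` avoids an extra copy of the whole file.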

Tickets:

  • Part of the task 127331

nshchego avatar Mar 07 '24 06:03 nshchego

Could you share some test data on how much we gain from importing the model through an mmap buffer? For example, how much memory is saved? And is there any impact on inference throughput or first-inference latency?

I attached perf numbers to the ticket

nshchego avatar Apr 15 '24 09:04 nshchego

Implementation LGTM. Please add tests.

CompileModelCacheTestBase already has enough test cases, and they cover mmap as well.

nshchego avatar Apr 25 '24 08:04 nshchego

@nshchego, the main question is: why can't we implement std::basic_streambuf over the mapped memory block? With such an implementation, we could reuse most of the existing serialization/deserialization code unchanged, without introducing a separate code path that accesses the buffer directly instead of going through an STL stream.
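
To illustrate the suggestion, here is a minimal sketch of a read-only stream buffer layered over an existing memory block. It is an assumption about what such an adapter could look like, not code from this PR: `MemoryStreamBuf` and `read_u32` are hypothetical names, and the block could be any contiguous region, mapped or not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <istream>
#include <streambuf>

// Hypothetical read-only streambuf whose get area is the whole memory block.
class MemoryStreamBuf : public std::streambuf {
public:
    MemoryStreamBuf(const char* data, std::size_t size) {
        assert(data != nullptr || size == 0);
        // setg() takes non-const pointers; this buffer never writes through them.
        char* begin = const_cast<char*>(data);
        setg(begin, begin, begin + size);
    }

protected:
    // Needed so that seekg()/tellg() work on the owning istream.
    pos_type seekoff(off_type off,
                     std::ios_base::seekdir dir,
                     std::ios_base::openmode which = std::ios_base::in) override {
        if (!(which & std::ios_base::in)) {
            return pos_type(off_type(-1));
        }
        off_type base = 0;
        if (dir == std::ios_base::cur) {
            base = gptr() - eback();
        } else if (dir == std::ios_base::end) {
            base = egptr() - eback();
        }
        const off_type target = base + off;
        if (target < 0 || target > egptr() - eback()) {
            return pos_type(off_type(-1));
        }
        setg(eback(), eback() + target, egptr());
        return pos_type(target);
    }

    pos_type seekpos(pos_type pos,
                     std::ios_base::openmode which = std::ios_base::in) override {
        return seekoff(off_type(pos), std::ios_base::beg, which);
    }
};

// Example consumer that only knows about std::istream, like existing
// stream-based deserialization code would.
std::uint32_t read_u32(std::istream& in) {
    std::uint32_t value = 0;
    in.read(reinterpret_cast<char*>(&value), sizeof(value));
    return value;
}
```

A caller would construct `MemoryStreamBuf buf(ptr, size); std::istream in(&buf);` and pass `in` to the existing deserialization routines. The trade-off is that `read()` still copies bytes out of the block, so a zero-copy path for large tensors would need direct access to the underlying pointer.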

maxnick avatar May 22 '24 16:05 maxnick

This PR will be closed in a week because of 2 weeks of no activity.

github-actions[bot] avatar Jun 07 '24 00:06 github-actions[bot]
