
PX (P90) for inference Cold start

Open tshrjn opened this issue 8 months ago • 1 comments

Describe the bug Please provide a clear and concise expectation of what cold start looks like. I see the docs mention a couple of methods to speed up the load time for models; it would be great if objective numbers could be added. Ray also provides methods to combat cold start, and I see the library is being utilized, but do you use such methods?

For example, in the image below from this article, cold starts for most providers are below 100s (see img), and most providers list P90/P70/P50 values to help understand the cold start problem and solutions in those terms.
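To make the P90/P70/P50 request concrete, here is a minimal sketch of how such numbers could be derived from measured cold-start durations. It uses the nearest-rank percentile definition, which is only one of several common conventions. The sample durations and the helper names (`percentile`, `summarize_cold_starts`) are hypothetical and are not taken from runhouse, Ray, or any provider's published figures.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not isinstance(p, int) or not 0 < p <= 100:
        raise ValueError("p must be an integer in the range 1..100")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize_cold_starts(samples, levels=(50, 70, 90)):
    """Return a dict such as {"P50": ..., "P70": ..., "P90": ...}."""
    return {f"P{p}": percentile(samples, p) for p in levels}


# Hypothetical cold-start durations in seconds (illustrative only).
cold_start_seconds = [12.0, 15.5, 18.2, 21.0, 24.7, 30.1, 41.3, 55.0, 78.4, 96.9]

summary = summarize_cold_starts(cold_start_seconds)
for name, value in summary.items():
    print(f"{name}: {value:.1f}s")
```

With real measurements, the durations would come from timing the interval between a request arriving at a scaled-to-zero deployment and the first response. A P90 value means 90% of observed cold starts finished within that time.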

Other relevant stuff: https://news.ycombinator.com/item?id=35738072 https://www.banana.dev/blog/turboboot

tshrjn Oct 29 '23 21:10