Nick Hill

Results: 114 comments by Nick Hill

@youkaichao another reason the above approach might be better - IIUC the `get_example_metadata_list` approach won't work if the size varies much at runtime (not sure whether that might be the...

@youkaichao I've opened #4844 to show the idea, PTAL!

@aurickq curious how this relates to https://github.com/vllm-project/vllm/pull/3729?

@MLHafizur this might indicate that the MM and/or adapter containers restarted. Could you check whether that's the case?

@lizzzcai though it is now the default, cluster scope operation should be considered relatively alpha and still needs a bit more work. In particular w.r.t. how the secrets are handled...

> I see, the controller keeps watching the etcd by [design](https://github.com/kserve/modelmesh-serving/tree/main/docs/architecture#architecture-overview).

Yes, though this is read-only; it's just used to trigger a predictor reconciliation when things change. Otherwise, the etcd data...

Also encountered this when upgrading from 0.7.6 to 0.7.7, with BLOOM 176B.

@RezaYazdaniAminabadi I can confirm that version 0.8.0 fixed the issue for me.

@RezaYazdaniAminabadi apologies, I spoke too soon... it's now working for BLOOM 176B with the pre-sharded fp16 weights, but not the original `.bin` checkpoint shards (which do work with 0.7.6). We...

I'd also been thinking about this recently. I think it would be nice to have some kind of `skip_detokenization` or `include_text=False` option in the sampling params.
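
To make the idea concrete, here is a rough sketch of how such an option might behave. Everything in it is an assumption for illustration: the `SamplingParams` and `Output` classes and the `finalize` helper are hypothetical stand-ins, not the library's actual classes, and `skip_detokenization` is just the name floated above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SamplingParams:
    """Hypothetical stand-in for illustration, not a real library class."""
    max_tokens: int = 16
    skip_detokenization: bool = False  # hypothetical flag suggested in the comment


@dataclass
class Output:
    token_ids: List[int]
    text: Optional[str]  # None when detokenization was skipped


def finalize(token_ids: List[int], params: SamplingParams,
             vocab: Dict[int, str]) -> Output:
    """Build the final output, decoding to text only when requested."""
    ids = token_ids[: params.max_tokens]
    if params.skip_detokenization:
        # Caller only wants token ids, so skip the decode step entirely.
        return Output(token_ids=ids, text=None)
    # Toy decode: join vocabulary pieces (a real tokenizer would go here).
    return Output(token_ids=ids, text="".join(vocab[i] for i in ids))
```

With something like this, callers that only consume token ids would get `text=None` and avoid the decode cost, while the default behavior stays unchanged.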