Nicolas Patry
Did you properly set `--shm-size 1g`?
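For reference, a minimal launch sketch with the shared-memory size raised; the image tag, port, volume path, and model id below are placeholders, not values taken from this issue:

```shell
# Sketch of a TGI launch with 1 GB of shared memory for NCCL; adjust the
# image tag, model id, and volume path to your setup.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id tiiuae/falcon-7b
```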
Not really; 60s for cross-GPU communication is already a lot. Allowing for a longer timeout will not help here, since the cards simply cannot communicate.
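One way to sanity-check whether the GPUs can reach each other at all, using standard NVIDIA/NCCL tooling rather than anything TGI-specific:

```shell
# Print the GPU interconnect topology; if the cards only reach each other via
# SYS (host bridges) or not at all, NCCL collectives can hang like this.
nvidia-smi topo -m

# Passing -e NCCL_DEBUG=INFO to the container (a standard NCCL variable) also
# makes the communicator setup log where it stalls.
```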
Created a PR for it.
Maybe take example from other models we have implemented in `server/text-generation-server/models/custom_modeling/*.py`? There are also some files in `server/text-generation-server/models/*.py`. Those declare the model as being flash-enabled (the batching happens...
It's supported on a "best effort" basis. I started some work to actually support it, but it means rewriting flash attention (the CUDA version) with added bias, which may take...
> on implementing dynamic batching for this as it only supports 1 concurrent request for now on AutoModel.

This won't require extra work once we have flash attention.
Because it doesn't implement the flash attention we want. This is Triton's flash attention, which doesn't support "unpadded" batching, which is necessary to work nicely with TGI (removing...
Here is the non-flash version (as a temporary measure, since modifying the kernel is taking more time than I anticipated): https://github.com/huggingface/text-generation-inference/pull/514. This should enable sharding at least.
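For anyone wanting to try the sharded non-flash path once that PR lands, a rough sketch of a multi-GPU launch; the shard count and model id are only examples, not values confirmed in this thread:

```shell
# Hypothetical sharded launch across 2 GPUs using the non-flash code path;
# --num-shard controls how many GPUs the weights are split over.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id tiiuae/falcon-40b \
    --num-shard 2
```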
https://github.com/huggingface/text-generation-inference/pull/514 should make TRUST_REMOTE_CODE no longer necessary.
I will close this issue since it seems to have been solved. For `tiiuae/falcon-rw-1b`, feel free to open an issue with the env and stacktrace so we can look into fixing...