Ray Huang

6 comments by Ray Huang

BTW, my model is BERT, any hints?

> @rahuan can you try to run it with the latest Triton server (rebuild the image if you are not using the latest one)? And enable verbose logging...
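
For reference, a minimal sketch of that suggestion in Python, assuming `tritonserver` is on the PATH; the model-repository path is a placeholder, and `--log-verbose=1` is the standard tritonserver flag for detailed per-request logs:

```python
# Minimal sketch: launch tritonserver with verbose logging enabled.
# The model-repository path is a placeholder; adjust it to your setup.
import subprocess

def launch_triton(model_repo: str) -> subprocess.Popen:
    """Start tritonserver so detailed per-request logs are printed."""
    cmd = [
        "tritonserver",
        f"--model-repository={model_repo}",
        "--log-verbose=1",  # emit verbose logs for each request
    ]
    return subprocess.Popen(cmd)

if __name__ == "__main__":
    server = launch_triton("/workspace/model_repository")
    server.wait()
```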

I just synced the latest code of fastertransformer_backend; it now fails even faster, even at a very low QPS. Below are the errors:

I1212 06:38:03.990948 1 libfastertransformer.cc:1022] get total batch_size = 1
I1212...

> What seq length are you using?

Batch_size is 10 or 20; the seq length is different for each sentence in a batch, the average is about 50~60, but the max...
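
A minimal client-side sketch of such a batch, padding each request to the per-batch max length; the tensor names (`input_ids`, `sequence_length`), the INT32 datatypes, and the model name `bert` are assumptions here, so check them against your config.pbtxt:

```python
# Hedged sketch: pad variable-length token sequences to the batch max
# and send them to Triton over HTTP. Tensor/model names are assumptions.
import numpy as np
import tritonclient.http as httpclient

def infer_batch(sequences, url="localhost:8000", model="bert"):
    """Pad a batch of token-id lists to its max length and run inference."""
    max_len = max(len(s) for s in sequences)  # per-batch max, ~50-60 avg here
    input_ids = np.zeros((len(sequences), max_len), dtype=np.int32)
    seq_len = np.zeros((len(sequences), 1), dtype=np.int32)
    for i, s in enumerate(sequences):
        input_ids[i, : len(s)] = s
        seq_len[i, 0] = len(s)

    inputs = [
        httpclient.InferInput("input_ids", list(input_ids.shape), "INT32"),
        httpclient.InferInput("sequence_length", list(seq_len.shape), "INT32"),
    ]
    inputs[0].set_data_from_numpy(input_ids)
    inputs[1].set_data_from_numpy(seq_len)

    client = httpclient.InferenceServerClient(url=url)
    return client.infer(model, inputs)
```

Padding to the per-batch max (rather than a fixed global max) keeps the padded width close to the real lengths, and it should matter even less if is_remove_padding is enabled.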

The model settings are the same as bert-base-chinese: layer num is 12, head num is 12, hidden size is 768 = 64*12. Thanks! BTW, the data_type is fp16, is_remove_padding...
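
As a quick consistency check of those numbers (plain Python, nothing backend-specific; the dict keys just mirror the names used above):

```python
# The settings quoted above, collected for a quick consistency check.
# Keys mirror the names used in the comment; this is not a config file format.
settings = {
    "layer_num": 12,
    "head_num": 12,
    "size_per_head": 64,
    "data_type": "fp16",
    "is_remove_padding": True,
}

# hidden size = head_num * size_per_head = 12 * 64 = 768
hidden_size = settings["head_num"] * settings["size_per_head"]
assert hidden_size == 768
```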

@PerkzZheng, may I ask if there are any findings about this issue?