
Efficient Inference for Big Models

16 BMInf issues

```
ERROR in app: Exception on /api/fillblank [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2070, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1515, in full_dispatch_request
    rv = self.handle_user_exception(e)
...
```
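For context, a minimal sketch of what such an endpoint typically looks like; the route shape matches the log above, but the `fill_blank` helper, payload field, and error handling are assumptions for illustration, not BMInf's documented API:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def fill_blank(text: str) -> str:
    # Hypothetical placeholder for the model call; in the reported setup
    # this is where BMInf-backed inference would run.
    raise NotImplementedError

@app.route("/api/fillblank", methods=["POST"])
def api_fillblank():
    payload = request.get_json(force=True)
    try:
        result = fill_blank(payload["text"])
    except Exception as exc:
        # Unhandled exceptions here are what surface as the traceback above.
        return jsonify({"error": str(exc)}), 500
    return jsonify({"result": result})
```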

I was reading the documentation and the technical paper, and it seems the experiments were done on a single node. Does BMInf support multi-node inference deployment for large models like...

File "/home/wenxuan/lihaijie_files/cpm-live/examples/tune_cpm_ant.py", line 56, in delta_model.freeze_module(exclude=["deltas"], set_state_dict=True) File "/home/wenxuan/miniconda3/envs/lhj/lib/python3.9/site-packages/opendelta/basemodel.py", line 274, in freeze_module self._freeze_module_recursive(module, exclude, "") # modify the active state dict that still need grad File "/home/wenxuan/miniconda3/envs/lhj/lib/python3.9/site-packages/opendelta/basemodel.py", line 316,...

**Is your feature request related to a problem? Please describe.**
There are other speedup methods for transformers, such as [FasterTransformer](https://github.com/NVIDIA/FasterTransformer).

**Describe the solution you'd like**
Can you describe how your method...
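For comparison, BMInf applies its quantization and CPU/GPU scheduling by wrapping an existing PyTorch model in place. A minimal sketch, assuming the `bminf.wrapper` entry point from the project README; the backbone model and memory limit value are arbitrary examples:

```python
import torch
import bminf
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("t5-large")

# Wrap the model so BMInf manages quantized weights and parameter
# offloading; memory_limit caps GPU memory use (here ~2 GiB).
with torch.cuda.device(0):
    model = bminf.wrapper(model, quantization=True, memory_limit=2 << 30)
```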

I'd like to learn about loading and trying out the Live model.

Label: question