Damian Kalinowski
**Is your feature request related to a problem? Please describe.** In our project we are trying to use Drogon for very long requests (up to 10 minutes long). While it...
**Is your feature request related to a problem? Please describe.** There is an overhead of creating new threads when using the streaming response feature. This Drogon example demonstrates it very well:...
### 🛠Summary Public official links
### 🛠Summary 0.35 BFCL multiturn unary Changes to original chat template: - introducing reasoning=`none` - this automatically adds an empty reasoning channel, which forces the model to skip the...
It fixes the issue of missing openvino-tokenizer support for Python 3.9 and below; however, export_models.py still fails with: ``` Traceback (most recent call last): File "/usr/local/bin/optimum-cli", line 8, in sys.exit(main()) ^^^^^^...
### 🛠Summary 0.35 BFCL multiturn unary Changes to original chat template: - introducing reasoning=`none` - this automatically adds an empty reasoning channel, which forces the model to skip the...
### 🛠Summary Changes that do not follow the OpenAI harmony format but work around model accuracy issues - tool call appears in the wrong channel (analysis), not only in commentary as supposed...
### 🛠Summary .bin files can be passed directly to the benchmark app, which can also test output accuracy, as @liubo-intel suggested
### 🛠Summary CVS-166468