Vincent Min
I don't know the specifics of how Ollama achieves JSON mode, but let me point out that vLLM supports [outlines](https://github.com/outlines-dev/outlines) and [lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) for guiding generation, [see the vLLM docs](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-chat-api). It...
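For reference, a request using one of those extra parameters might look like the sketch below. The model name, schema, and endpoint are placeholders; `guided_json` is the vLLM-specific field described in the linked docs, not part of the standard OpenAI API:

```python
import json

# Hypothetical JSON schema the model's output should conform to.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

# Request body for vLLM's OpenAI-compatible /v1/chat/completions endpoint.
# `guided_json` is vLLM's extra parameter for schema-guided generation
# (backed by outlines or lm-format-enforcer on the server side).
payload = {
    "model": "meta-llama/Llama-3-8b-instruct",  # placeholder model name
    "messages": [{"role": "user", "content": "Describe a person as JSON."}],
    "guided_json": schema,
}

body = json.dumps(payload)
```

This body could then be POSTed to the server (e.g. `http://localhost:8000/v1/chat/completions`), or passed through the `openai` client via `extra_body`.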
I can also confirm that Ollama embeddings for `snowflake-arctic-embed:137m-m-long-fp16` are not behaving as expected. I set up a synthetic benchmark for internal testing. I take 500 articles and use an...
We are suffering from the same issue. The problem occurs stochastically, which has prevented us from creating a minimal reproducible example. We are using SCOOP together with the DEAP package.
An official lm-evaluation-harness Docker image would be great. I just found that the BigCode project hosts the following Docker image: https://github.com/orgs/bigcode-project/packages/container/package/evaluation-harness
For anyone interested, here's a clean solution:

```python
from langchain_openai import ChatOpenAI
from langsmith import tracing_context

llm = ChatOpenAI()
with tracing_context(enabled=False):
    # Anything in this code block will **not** be traced to LangSmith
    llm.invoke("hello...
```
I encountered this issue too after migrating to `langchain-core==0.3.x` and switching from Pydantic v1 to v2. It would be great if Ragas could be updated to be compatible.
Thanks for clarifying, Nathan. I was assuming that the LLM was used only to decide which agent should be used to address the current question, i.e. the LLM functions purely...
Yes, that's correct.
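That routing pattern can be sketched as follows. The agent names and the keyword heuristic standing in for the routing LLM call are purely illustrative; in a real system `choose_agent` would prompt an LLM to return one of the agent names:

```python
# Two toy agents; in practice these would be full chains or tool-using agents.
AGENTS = {
    "math": lambda q: f"[math agent] answering: {q}",
    "search": lambda q: f"[search agent] answering: {q}",
}

def choose_agent(question: str) -> str:
    # Stub for the routing LLM call: a real router would prompt an LLM to
    # pick an agent name. A keyword heuristic keeps the sketch runnable.
    return "math" if any(c.isdigit() for c in question) else "search"

def route(question: str) -> str:
    # The router only decides *which* agent runs; it never answers itself.
    agent = AGENTS[choose_agent(question)]
    return agent(question)

print(route("What is 2 + 2?"))       # handled by the math agent
print(route("Who wrote Hamlet?"))    # handled by the search agent
```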
This feature is a must-have. I am interested in using Llamafile as an alternative to Ollama, but the lack of support for prompt templates is a dealbreaker.
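As a stopgap, the chat template can be applied client-side before sending the prompt to a raw completion endpoint — a minimal sketch assuming a ChatML-style model (the template must match whatever format the underlying model was actually trained on, so this is not a general replacement for server-side template support):

```python
def apply_chatml(messages):
    # Render OpenAI-style messages with the ChatML template. Sent as a raw
    # prompt, this approximates what server-side template support would do.
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant")  # cue the model to respond
    return "\n".join(parts)

prompt = apply_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

The rendered string would then be posted to the completion endpoint, ideally with the template's end-of-turn marker (`<|im_end|>` here) configured as a stop sequence.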