
langchain.generate

Open zhzfight opened this issue 2 years ago • 4 comments

The code here doesn't seem reasonable:

def generate(
        self,
        prompts: list[ChatPromptTemplate],
        n: int = 1,
        temperature: float = 1e-8,
        callbacks: t.Optional[Callbacks] = None,
    ) -> LLMResult:
        # set temperature to 0.2 for multiple completions
        temperature = 0.2 if n > 1 else 1e-8
        if isBedrock(self.llm) and ("model_kwargs" in self.llm.__dict__):
            self.llm.model_kwargs = {"temperature": temperature}
        else:
            self.llm.temperature = temperature

        if self.llm_supports_completions(self.llm):
            return self._generate_multiple_completions(prompts, n, callbacks)
        else:  # call generate_completions n times to mimic multiple completions
            list_llmresults = run_async_tasks(
                [self.generate_completions(prompts, callbacks) for _ in range(n)]
            )

            # fill results as if the LLM supported multiple completions
            generations = []
            for i in range(len(prompts)):
                completions = []
                for result in list_llmresults:
                    completions.append(result.generations[i][0])
                generations.append(completions)

            llm_output = _compute_token_usage_langchain(list_llmresults)
            return LLMResult(generations=generations, llm_output=llm_output)
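
For context, the else branch above is roughly equivalent to the standalone asyncio sketch below (simplified; the generate_completions coroutine and the result shape are taken from the quoted code, the function name is made up for illustration): it fans out n concurrent generations and then regroups them so that generations[i] holds the n completions for prompt i.

import asyncio

# Simplified sketch of the fallback path above: launch n concurrent
# generation calls, then regroup the results per prompt. Not an exact
# copy of the ragas code, just the same fan-out/regroup pattern.
async def mimic_multiple_completions(generate_completions, prompts, n):
    list_llmresults = await asyncio.gather(
        *[generate_completions(prompts) for _ in range(n)]
    )
    generations = [
        [result.generations[i][0] for result in list_llmresults]
        for i in range(len(prompts))
    ]
    return generations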

I run the evaluation on CPU (with system memory only, no GPU) and encounter an error:

here 3
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
Llama.generate: prefix-match hit
GGML_ASSERT: /tmp/pip-install-dnbwilnk/llama-cpp-python_8aba6af1128d49b6bada62c4fe0fd870/vendor/llama.cpp/ggml.c:15149: cgraph->nodes[cgraph->n_nodes - 1] == tensor
GGML_ASSERT: /tmp/pip-install-dnbwilnk/llama-cpp-python_8aba6af1128d49b6bada62c4fe0fd870/vendor/llama.cpp/ggml.c:4039: ggml_can_mul_mat(a, b)
GGML_ASSERT: /tmp/pip-install-dnbwilnk/llama-cpp-python_8aba6af1128d49b6bada62c4fe0fd870/vendor/llama.cpp/ggml-alloc.c:453: view->view_src != NULL && view->view_src->data != NULL
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No stack.
The program is not being run.
Segmentation fault (core dumped)
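
As an aside, the repeated tokenizers warning appears unrelated to the crash and can be silenced exactly as the message suggests, by setting the environment variable before tokenizers/transformers are imported, for example:

import os

# Set before any tokenizers/transformers import to avoid the fork warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

The actual failure is the GGML_ASSERT output and the segmentation fault coming from llama.cpp.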

I checked and n is 3, so this code may be causing it:

list_llmresults = run_async_tasks(
    [self.generate_completions(prompts, callbacks) for _ in range(n)]
)
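
One way to test that suspicion (a hypothetical diagnostic, assuming run_async_tasks returns results in task order) would be to issue the completions one at a time instead of concurrently and see whether the crash goes away:

# Hypothetical diagnostic: run the n completions sequentially, so only one
# llama.cpp call is in flight at a time.
list_llmresults = []
for _ in range(n):
    result = run_async_tasks([self.generate_completions(prompts, callbacks)])[0]
    list_llmresults.append(result)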

Can you help me? Thanks!

zhzfight · Dec 05 '23 07:12

Which LLM are you using? If you are using an open-source model or embeddings, could you share how you're initializing it, along with the details?

jjmachan · Dec 05 '23 12:12

I initialize my model with the following code:

from langchain.llms import LlamaCpp

model = LlamaCpp(
    model_path="model/mistral-7b-instruct-v0.1.gguf",
    temperature=0.70,
    max_tokens=2000,
    n_ctx=4096,
    top_p=1,
    verbose=True,
)

My package versions are: langchain==0.0.340, ragas==0.0.20.

zhzfight · Dec 06 '23 01:12

I see. This bug is caused by LlamaCpp not working properly with async, because of multithreading issues.

jjmachan · Dec 07 '23 16:12

I see. This bug is caused by LlamaCpp not working properly with async, because of multithreading issues.

So, is there a workaround? How can this be solved?

Davo00 · Mar 01 '24 22:03
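
For anyone hitting the same thing, one possible workaround is sketched below. It assumes the crash comes from several concurrent calls hitting a single LlamaCpp instance, as diagnosed above; the SerializedLlamaCpp wrapper is hypothetical, not part of ragas or LangChain, and it simply forces generations to run one at a time:

import threading

from langchain.llms import LlamaCpp

# Module-level lock shared by all calls into the local llama.cpp model.
_llamacpp_lock = threading.Lock()

class SerializedLlamaCpp(LlamaCpp):
    """Hypothetical LlamaCpp wrapper that allows only one generation at a time."""

    def _call(self, prompt, stop=None, run_manager=None, **kwargs):
        # Serialize access so ragas's async fan-out never runs two
        # llama.cpp generations concurrently on the same model.
        with _llamacpp_lock:
            return super()._call(prompt, stop=stop, run_manager=run_manager, **kwargs)

model = SerializedLlamaCpp(
    model_path="model/mistral-7b-instruct-v0.1.gguf",
    temperature=0.70,
    max_tokens=2000,
    n_ctx=4096,
    top_p=1,
    verbose=True,
)

Whether this avoids the GGML_ASSERT in every case is untested; the underlying idea is just to keep concurrent calls away from the local llama.cpp model.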