
run time

Open DayanaYuan opened this issue 1 year ago • 7 comments

It's been running all night with no results.

```python
def fetch_GPT_response(instruction, system_prompt, chat_model_id, chat_deployment_id, temperature=0):
    print('Calling OpenAI...')
    print("1.4\n")
    response = openai.ChatCompletion.create(
        temperature=temperature,
        deployment_id=chat_deployment_id,
        model=chat_model_id,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": instruction}
        ]
    )
    print("1.5\n")
    if ('choices' in response
            and isinstance(response['choices'], list)
            and len(response['choices']) > 0
            and 'message' in response['choices'][0]
            and 'content' in response['choices'][0]['message']):
        return response['choices'][0]['message']['content']
    else:
        return 'Unexpected response'
```

after print("1.4\n"), the operation is stuck. No result

DayanaYuan avatar Feb 26 '24 13:02 DayanaYuan

Hi @yangyangyang-github This is a typical OpenAI API call. If the function is going into an endless loop, can you please double-check the OpenAI credentials that were provided in the config.yaml file? Also, we have already put guardrails in the API call function, using the retry module, to stop making calls after a sufficient amount of time. Please refer here. Hence, if you are using the same functionality, it should not go into an endless loop. Let me know how things turn out for you.
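For illustration, a minimal sketch of that kind of guardrail written with the `tenacity` library (the exact retry module and parameters used in KG_RAG may differ):

```python
# Sketch only: retry with exponential backoff and a hard stop after 5 attempts.
# The actual guardrail in KG_RAG may use a different module and different limits.
from tenacity import retry, stop_after_attempt, wait_random_exponential

@retry(wait=wait_random_exponential(min=1, max=30), stop=stop_after_attempt(5))
def guarded_fetch(instruction, system_prompt, chat_model_id, chat_deployment_id):
    # fetch_GPT_response is the function from the issue description above.
    # After 5 failed attempts, tenacity re-raises the last exception instead of looping forever.
    return fetch_GPT_response(instruction, system_prompt, chat_model_id, chat_deployment_id)
```

Note that a retry guardrail only kicks in when the call actually raises; a request that hangs with no timeout never triggers it, which is why pairing it with a per-request timeout (as sketched earlier in this thread) also covers the hanging case.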

karthiksoman avatar Feb 26 '24 18:02 karthiksoman


I have added the openai.api_key parameter to the file. What exactly are the OpenAI credentials that were provided in the config.yaml file? May I have a look, please?

DayanaYuan avatar Feb 27 '24 02:02 DayanaYuan

You should have a file named '.gpt_config.env' stored in your $HOME path. The content of the file should be in the following format:

```
API_KEY='openai api key'
API_VERSION='this is optional'
RESOURCE_ENDPOINT='this is optional'
```
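For reference, a minimal sketch of reading that file with `python-dotenv` (KG_RAG's own loading code may differ; the variable names simply match the format above):

```python
# Sketch only: load credentials from ~/.gpt_config.env and hand them to the openai SDK.
import os
import openai
from dotenv import load_dotenv

load_dotenv(os.path.join(os.path.expanduser('~'), '.gpt_config.env'))

openai.api_key = os.environ['API_KEY']
api_version = os.environ.get('API_VERSION')              # optional, per the format above
resource_endpoint = os.environ.get('RESOURCE_ENDPOINT')  # optional, per the format above
```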

karthiksoman avatar Feb 28 '24 05:02 karthiksoman

Would it be convenient for you to share this file?


DayanaYuan avatar Feb 28 '24 05:02 DayanaYuan

The file contains API credentials, which, like any other sensitive information, should ideally not be shared publicly. Hope you understand :) Feel free to reach out if you need further assistance!

karthiksoman avatar Feb 28 '24 05:02 karthiksoman

For the Llama code, when execution reaches `llm = llama_model(MODEL_NAME, BRANCH_NAME, CACHE_DIR, stream=True, method=METHOD)`, which in turn calls `model = AutoModelForCausalLM.from_pretrained(model_name, device_map='auto', torch_dtype=torch.float16, revision=branch_name, cache_dir=cache_dir)`, the program simply exits.
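For reference, a minimal sketch that reproduces just this load step in isolation with verbose logging, to see how far it gets before the process exits (the checkpoint name, branch, and cache directory below are placeholders, not necessarily the values KG_RAG uses):

```python
# Sketch only: isolated reproduction of the from_pretrained call with verbose logging.
# A silent exit at this point is often the OS killing the process for running out of memory.
import torch
from transformers import AutoModelForCausalLM, logging

logging.set_verbosity_info()  # print checkpoint download / shard-loading progress

MODEL_NAME = "meta-llama/Llama-2-13b-chat-hf"  # placeholder checkpoint
BRANCH_NAME = "main"                           # placeholder revision
CACHE_DIR = "/path/to/cache"                   # placeholder cache directory

model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    device_map='auto',
    torch_dtype=torch.float16,
    revision=BRANCH_NAME,
    cache_dir=CACHE_DIR,
)
```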


DayanaYuan avatar Mar 01 '24 02:03 DayanaYuan

@yangyangyang-github Did you check whether this is a memory issue? We are not using quantized versions of llama here, so it could take a good chunk of memory. If you look here, you can see the size of the tensors for llama-13b and compare it with the memory of the machine that you are using.
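As a rough back-of-envelope check (assuming ~13 billion parameters stored in float16), the weights alone need roughly 26 GB before activations or the KV cache:

```python
# Rough estimate only: memory for llama-13b weights in float16.
n_params = 13e9
bytes_per_param = 2  # float16
print(f"~{n_params * bytes_per_param / 1e9:.0f} GB for the weights alone")  # ~26 GB
```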

I tried using llama-13b on a p3.8xlarge AWS instance, which has the following specs:

- 4 Tesla V100 GPUs
- 64 GB GPU memory
- 32 vCPUs
- 244 GB RAM
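For reference, a quick way to check free GPU memory on the machine before loading (a sketch, using PyTorch's `torch.cuda.mem_get_info`):

```python
# Sketch only: report free vs. total memory for each visible GPU.
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        free, total = torch.cuda.mem_get_info(i)
        print(f"GPU {i}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
else:
    print("No CUDA device visible; device_map='auto' will fall back to CPU/disk offload.")
```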

karthiksoman avatar Mar 01 '24 04:03 karthiksoman