pywinassistant
Is this library compatible with open-source LLMs such as Llama 3 or Mixtral-Large?
Usage of local LLMs is in the works.
I have tried other LLMs, but the assistant's results seem to worsen when using the same prompting techniques, which are intended for GPT-3 and GPT-4. About a year ago I tried it with Llama 1, but it failed so often that it would have required a total redesign of the prompting algorithm.
Now I'm trying local Llama 3 and Mixtral-Large, benchmarking different prompting techniques intended for those models and developing new agents that generate new prompts to benchmark different models within the same framework. When it's ready, I will choose the best, most secure ones and update the project to run locally too, but expect that for version 1.0.0 (we're at 0.4.0).
For security reasons I'm not willing to publish those agents, but be aware that this is possible.
Could you not just fine-tune a Llama 3 to your style of prompting (if it's just a prompting technique)? For example, Unsloth did this in this Colab: https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp
Also, I would like to point out that you could use the Azure GPT versions if you want to consider security, GDPR, and similar concerns. As a European, this is a big topic for us because of the new AI Act and our regulations. Azure should deploy the same models as OpenAI, so there shouldn't be much of a difference.
There is also a lot of ongoing research on prompt-engineering techniques; maybe some of these can help your project:
| Paper name | Date | Institute | Paper |
| --- | --- | --- | --- |
| Self consistency | Mar 22 | Google | https://arxiv.org/pdf/2203.11171.pdf |
| Generated knowledge | Sep 22 | University of Washington | https://arxiv.org/pdf/2110.08387.pdf |
| Chain of thought | Jan 23 | Google | https://arxiv.org/pdf/2201.11903.pdf |
| Least to most | Apr 23 | Google | https://arxiv.org/pdf/2205.10625.pdf |
| Chain of verification | Sep 23 | Meta | https://arxiv.org/pdf/2309.11495.pdf |
| Skeleton of thought | Oct 23 | Microsoft | https://arxiv.org/pdf/2307.15337.pdf |
| Step back prompting | Oct 23 | Google | https://arxiv.org/pdf/2310.06117.pdf |
| Rephrase and Respond | Nov 23 | UCLA | https://arxiv.org/pdf/2311.04205.pdf |
| Emotion Stimuli | Nov 23 | Microsoft | https://arxiv.org/pdf/2307.11760.pdf |
| System 2 attention | Nov 23 | Meta | https://arxiv.org/pdf/2311.11829.pdf |
| OPRO | Dec 23 | Google | https://arxiv.org/pdf/2309.03409.pdf |
I have pirated this list from J. Yarkoni - https://docs.google.com/presentation/d/1fboeXSrRhMBDuNKhs8ctKntTnIE5c4BqUAUt48TAvGE/edit#slide=id.g2c8b4d20382_2_0
best regards Vincent
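As a concrete illustration of the first technique in that list, self-consistency samples several reasoning paths at non-zero temperature and keeps the majority final answer. A minimal sketch (the function and stub names here are hypothetical, with a canned sampler standing in for a real LLM call):

```python
from collections import Counter

def self_consistency(sample_fn, prompt, n=5):
    """Sample n completions for the same prompt and return the majority answer."""
    answers = [sample_fn(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Deterministic stub standing in for a temperature > 0 LLM call (hypothetical)
canned = iter(["42", "41", "42", "42", "43"])
majority = self_consistency(lambda prompt: next(canned), "What is 6 * 7?")
print(majority)  # "42" — the most frequent of the five sampled answers
```

With a real model, `sample_fn` would issue one sampled completion per call; the vote over final answers is what makes the technique robust to individual bad reasoning paths.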
@henyckma Would it be possible for you to give us a small guide on how to port this to Ollama or similar so it can run on Llama 3? I have been trying and keep getting weird errors, so a guide from you would be great. No worries if not, though; I just want to try this out without the cost, haha.
One vital point, as mentioned, is that switching to a weaker model than, for instance, Mixtral will lead to issues with the prompts used. However (this is not my project, and I hope I'm not stepping on any toes), could you not just modify the core_api.py file to something like this?
import requests
import json

# Assuming you've set up your local LLM (e.g. Ollama) to accept POST requests at this endpoint
LOCAL_LLM_ENDPOINT = "http://localhost:11434/api/chat"

def api_call(messages, model_name="YOUR_OLLAMA_MODEL_NAME", temperature=0.5, max_tokens=150):
    # Prepare the messages payload for your local LLM
    payload = {
        "model": model_name,
        "messages": messages,
        "stream": True,  # Assuming your local model also supports streaming responses
        # Ollama expects sampling parameters under "options"
        "options": {"temperature": temperature, "num_predict": max_tokens},
    }
    try:
        # Send a POST request to your local LLM endpoint and stream the reply
        response = requests.post(LOCAL_LLM_ENDPOINT, json=payload, stream=True)
        response.raise_for_status()

        # Concatenate the streamed content until the completion message arrives
        output = ""
        for line in response.iter_lines():
            if line:
                body = json.loads(line)
                if "error" in body:
                    raise Exception(body["error"])
                if not body.get("done", False):
                    output += body.get("message", {}).get("content", "")
                else:
                    break

        return output.strip() if output else None
    except Exception as e:
        raise Exception(f"An error occurred while calling the local LLM: {e}")

# Example usage
# Replace this payload with the actual messages sequence for your use case
messages_payload = [
    {"role": "system", "content": "You are a helpful and knowledgeable assistant."},
    {"role": "user", "content": "Please help me troubleshoot my JavaScript code."},
]
result = api_call(messages_payload, temperature=0.7, max_tokens=100)
print(f"AI Analysis Result: '{result}'")
Hi @Razorbob Vincent,
I'm aware of several AI regulations in the EU and elsewhere. For now the project complies with the federal AI Standards Coordination Working Group, the Asilomar AI Principles, and the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems.
Before those regulations even existed, I was investigating and aware of the implications of what I should and should not release into the wild, so I deleted some parts of the code and hardcoded them in natural language instead of using pure agent inference.
Thank you for the information; it is very relevant to this project, and I'm studying it all. You have great proposals.
Regarding the agents that learn and refine prompts for Windows: releasing them into the wild (within this project), if developed wrongly, can lead to security concerns and will not comply with federal regulations, as bad actors can misuse them, something I truly don't wish; this project is purely intended to help people.
Anthropic just released a similar agent: https://x.com/anthropicai/status/1788958483565732213
My private agents are specifically designed for alignment and refinement on Windows to generate better prompting techniques. Instead of uploading those agents, I will choose the best prompts and update the project to use local LLMs with those prompts. It is going to take some time, but it will be better this way for security reasons.
For now, the Single Action Model of this project using the ChatGPT API works very well, helping users with disabilities use the Windows OS at minimal cost.
I'm working with local LLMs, aiming to make the assistant available to everyone for fast and free assistance. Using local LLMs greatly decreases accuracy compared to the current ChatGPT API implementation, which makes me believe the ChatGPT LLMs are probably trained on actual OS screens too.
I'm also working on fine-tuning Llama 3 for OS-screen knowledge.
Yeah, you are right about the security concerns of releasing it. I would even go so far as to say that a poorly developed model could screw up your whole system and data, for example by deleting files in System32 or setting random registry keys.
Thanks for your feedback; glad I can help. I love the work you have done, and I think you have a good project going. I'll try to contribute further, but sadly I have a company to run and can only do so much. I will open a PR to choose between OpenAI and Azure as soon as they release my sponsorship credits for the API; hopefully that is fine with you :). Do you have some kind of roadmap of what you want to do in the coming months? Maybe more people would be willing to contribute from here on.
OpenRouter uses LiteLLM to serve the models. If that were implemented here, it would give us the ability to use all kinds of models across local and cloud providers (OpenAI, Anthropic, Together, OpenRouter, etc.).
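For a sense of what such an abstraction buys, here is a hedged sketch of a provider-agnostic dispatch layer in the spirit of LiteLLM's single `completion()` call keyed by "provider/model" strings (all names below are hypothetical stand-ins, not LiteLLM's actual API):

```python
# Hypothetical routing layer: one call signature, dispatched on "provider/model"
def parse_model_string(model):
    """Split 'provider/model' into its parts; default provider is 'openai'."""
    provider, _, name = model.partition("/")
    if not name:
        provider, name = "openai", provider
    return provider, name

def completion(model, messages, backends):
    """Dispatch the chat request to the backend registered for the provider."""
    provider, name = parse_model_string(model)
    if provider not in backends:
        raise ValueError(f"No backend registered for provider '{provider}'")
    return backends[provider](name, messages)

# Stub backends standing in for real OpenAI / Ollama clients
backends = {
    "openai": lambda name, msgs: f"openai:{name}",
    "ollama": lambda name, msgs: f"ollama:{name}",
}
print(completion("ollama/llama3", [{"role": "user", "content": "hi"}], backends))
# prints "ollama:llama3"
```

The point of the design is that swapping providers becomes a change to the model string, not to the calling code.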
...and may I just add, after looking through the code more thoroughly: you will need to include some sort of model for the image analysis; BakLLaVA would perhaps work in cooperation with the Mixtral model for the proper responses.
There are a lot of considerations and prompts if this were to be done; however, if there are ethical concerns, it ends here :)
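On the image-analysis point: Ollama's /api/chat accepts base64-encoded images on a message, so a multimodal model such as BakLLaVA could handle screenshots while a text model handles planning. A hedged sketch of the request payload only (the helper name is an assumption, and the model would need to be pulled locally before this works):

```python
import base64
import json

def build_vision_payload(model, prompt, image_bytes):
    """Build an Ollama /api/chat request with a base64-encoded image attached."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": prompt,
            # Ollama expects base64-encoded image data under "images"
            "images": [base64.b64encode(image_bytes).decode("ascii")],
        }],
        "stream": False,
    }

# A fake PNG header stands in for a real screenshot here
payload = build_vision_payload("bakllava", "Describe this screenshot.", b"\x89PNG fake")
print(json.dumps(payload)[:40])
```

Posting this payload to the same endpoint as the text model would return the model's description of the image in the usual chat-response shape.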