
Legal and Ethical Safeguards for Prompts

Open w0lph opened this issue 1 year ago • 14 comments

Duplicates

  • [X] I have searched the existing issues

Summary 💡

Currently, the agents are entirely unbounded by ethical and legal considerations. I have provided some examples that are a step toward adding default safeguards against malicious behavior. This is a complex and evolving issue, but something is better than nothing.

Examples 🌈

Heuristic Imperatives from David Shapiro: A simple constraint: "Reduce suffering in the universe, increase prosperity in the universe, and increase understanding in the universe."

Constitutional AI: Harmlessness from AI Feedback: A series of self-critique instructions.

Motivation 🔦

These constraints must be added to the prompt.py file so that agents don't end up misbehaving and causing illegal or unethical consequences.
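A minimal sketch of what this could look like. The names `SAFETY_CONSTRAINTS` and `build_prompt` are hypothetical, not Auto-GPT's actual API; the idea is simply to hard-code the constraints ahead of the user's goals when the prompt is assembled:

```python
# Hypothetical sketch of default safeguards prepended to the agent prompt.
# SAFETY_CONSTRAINTS and build_prompt are illustrative names, not Auto-GPT code.

SAFETY_CONSTRAINTS = [
    "Reduce suffering in the universe.",
    "Increase prosperity in the universe.",
    "Increase understanding in the universe.",
    "Never take actions that are illegal or that harm people or systems.",
]

def build_prompt(goals: list[str]) -> str:
    """Assemble a system prompt with the safety constraints listed first,
    so they frame every goal the user supplies."""
    lines = ["CONSTRAINTS:"]
    lines += [f"{i}. {c}" for i, c in enumerate(SAFETY_CONSTRAINTS, 1)]
    lines.append("GOALS:")
    lines += [f"{i}. {g}" for i, g in enumerate(goals, 1)]
    return "\n".join(lines)
```

Because the constraints come first in the assembled prompt, they apply regardless of which goals the user configures.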

w0lph avatar Apr 20 '23 17:04 w0lph

They are if you use the azure api

Cytranics avatar Apr 20 '23 20:04 Cytranics

They are if you use the azure api

Relying solely on lower levels of the stack to act as a safeguard is not the safest path: people might use different, unconstrained models, or those models might be compromised.

It's prudent to add a safeguard at the agent level to prevent unintended behavior if the model's own protections prove insufficient. This creates an additional safety baseline for Auto-GPT, on which further capabilities can be built more safely.

w0lph avatar Apr 20 '23 20:04 w0lph

As discussed in https://github.com/Significant-Gravitas/Auto-GPT/discussions/211, this is a very important concern. Putting a safer default is very valuable globally. This would also help avoid potential future legal problems with the userbase.

@w0lph is there a PR associated with this issue? It would help a lot with getting it processed.

rabyj avatar Apr 21 '23 17:04 rabyj

I will create a plugin with the first prompt.

hdkiller avatar Apr 21 '23 17:04 hdkiller

I'm changing the title to Safeguards as I don't want this to be seen as censorship, just safe defaults. It also encompasses other improvements on this front.

w0lph avatar Apr 21 '23 17:04 w0lph

We discussed this internally a bit. One concern is people could remove the safeguarding code very easily. Thoughts on how to help with that?

ntindle avatar Apr 22 '23 08:04 ntindle

That’s their decision if they remove safeguards, as they are responsible for the actions of their bot.

hdkiller avatar Apr 22 '23 08:04 hdkiller

Encrypt safety prompts using a hash so the average joe won't know. If someone is smart enough to decrypt them, they can build their own agent anyway.


Cytranics avatar Apr 22 '23 11:04 Cytranics

They could just remove that section of code though

ntindle avatar Apr 23 '23 03:04 ntindle

Stop calling people who take an interest in AGI "average Joes"; it involves risk to humanity. Refer to comic books. :)

IsleOf avatar Apr 23 '23 12:04 IsleOf

One starting point would be passing shell commands through this safeguard prior to execution. It could then look for "jailbreaks" or other questionable behavior, such as attempts to gain root access.
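As a rough illustration of such a pre-execution check (the function name and the pattern list are examples only, not a complete or official detector):

```python
import re

# Illustrative deny-list of dangerous shell patterns. A real safeguard would
# need a much broader, maintained list; these are examples only.
BLOCKED_PATTERNS = [
    r"\bsudo\b",                 # privilege escalation
    r"\brm\s+-rf\s+/",           # destructive recursive delete from root
    r"\bchmod\s+777\b",          # world-writable permissions
    r"curl[^|]*\|\s*(ba)?sh",    # piping remote content straight into a shell
]

def is_command_allowed(command: str) -> bool:
    """Return False if the shell command matches a known-dangerous pattern."""
    return not any(re.search(p, command) for p in BLOCKED_PATTERNS)
```

An agent would call `is_command_allowed` before handing the command to its executor and refuse (or ask the user) on a match.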

Boostrix avatar Apr 30 '23 06:04 Boostrix

Makes more sense to have this as a .env feature that is enabled by default.
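Something like the following, where `SAFEGUARDS_ENABLED` is a hypothetical variable name read the same way Auto-GPT reads its other `.env` settings, and the flag defaults to on unless explicitly disabled:

```python
import os

def safeguards_enabled() -> bool:
    """Safeguards are on unless the user explicitly opts out in .env,
    e.g. SAFEGUARDS_ENABLED=false. (Variable name is illustrative.)"""
    return os.getenv("SAFEGUARDS_ENABLED", "true").lower() not in ("false", "0", "no")
```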

People will always find a way to circumvent censorship, even when it is being spoon-fed to them as "safety".

suparious avatar May 11 '23 02:05 suparious

People will always find a way to circumvent censorship, even when it is being spoon-fed to them as "safety."

In a saner and more intelligent society, people would stop conflating those two concepts and start promoting responsible AI development with the caution and nuance it deserves.

w0lph avatar May 11 '23 04:05 w0lph

See also: #211

Boostrix avatar May 11 '23 08:05 Boostrix

This issue has automatically been marked as stale because it has not had any activity in the last 50 days. You can unstale it by commenting or removing the label. Otherwise, this issue will be closed in 10 days.

github-actions[bot] avatar Sep 06 '23 21:09 github-actions[bot]

This issue was closed automatically because it has been stale for 10 days with no activity.

github-actions[bot] avatar Sep 17 '23 01:09 github-actions[bot]