BenevolentByDesign
Freedom
What if, instead of giving one AI your core objective functions, you had many AIs, each starting with objective functions sampled from a distribution whose mean is the three core functions you describe? Each one would be free to evolve its own distribution of objective functions, but on average their society would probably preserve many features of the three core functions, and if they were truly well designed it would survive into the indefinite future.

(or something along those lines)
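The idea above can be sketched as a toy simulation. Everything here is a hypothetical illustration, not anything from the book: agents are reduced to weight vectors over three core objectives, initialized as noisy copies of a fixed core and then left to mutate freely. The point it demonstrates is statistical, not architectural.

```python
import random

random.seed(42)  # reproducible run for this illustration

CORE = [1.0, 1.0, 1.0]   # fixed core objective weights (stand-in for the three core functions)
N_AGENTS = 100
NOISE = 0.05             # initial spread around the core
DRIFT = 0.01             # per-generation mutation size

def spawn_population():
    """Each agent samples its weights from a distribution centered on the core."""
    return [[w + random.gauss(0, NOISE) for w in CORE]
            for _ in range(N_AGENTS)]

def evolve(pop):
    """Agents mutate independently; no agent ever consults the core again."""
    return [[w + random.gauss(0, DRIFT) for w in agent]
            for agent in pop]

def mean_weights(pop):
    """The society-wide average of each objective weight."""
    return [sum(agent[i] for agent in pop) / len(pop)
            for i in range(len(CORE))]

pop = spawn_population()
for _ in range(1000):
    pop = evolve(pop)

print(mean_weights(pop))
```

After 1,000 generations any single agent has drifted substantially, but the population mean has a variance shrunk by a factor of 1/N, so it stays close to the core far longer. It is still an unanchored random walk, though, and will wander eventually, which is exactly the reply's point about needing something fixed.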
I would call those "auxiliary" or "secondary" functions. You still need something fixed, like a constitution, that will not drift over time. Check out Anthropic AI.