Open-Assistant
The repository should include a document clearly detailing what can and cannot be generated with any models or tools built or derived from it.
The license listed here is Apache 2.0.
For clarification, and for the avoidance of any doubt, the read-me and associated documentation should indicate whether mature, explicit or NSFW content can (or cannot) be generated with the model/toolset, provided that the content (or generation thereof) does not constitute a breach of appropriate and relevant legal or regulatory requirements in a given user's jurisdiction or region. (You might also add applicable community standards here, but those can vary quite considerably.)
In addition to the above, the read-me (or a separate ethical generation and use policy document) should ideally indicate whether certain sensitive areas are allowed or disallowed.
Some sample areas of potential concern follow (this is not an exhaustive list):
* Content which contains overt political or ideological content, or which is intended to inform/influence the views or choices of a potential (competent) reader on issues of public concern or in an election. (Examples being campaign material, lobbying briefings or public service announcement "fillers".)
* The use of fictionalized representations of potentially identifiable individuals (living or deceased), corporations (both current and defunct) and prominent brands, franchises or trademarks associated with those individuals or corporations.
* Content which contains LGBTQI themes, including cross-dressing or explorations of non-binary and gender-fluid presentation.
* Content which, whilst not containing (explicit) depictions of actual sexual activity, may explore alternative sexuality, fetishes, or practices of a mutually consensual nature between informed, consenting adult participants. (You've already made it quite clear elsewhere that you do not want Open Assistant to be used to generate illegal obscenity.)
* Use of profanity and pejoratives (in an appropriate context).
* Depictions of violence, crime, 'abuse' or self-harm (in line with the editorial standards typically applied in print or other media).
* Professional advice which would typically be given by a qualified individual under regulatory supervision (such as doctors, attorneys, financial advisers, architects and engineers).
I know that this may seem overly cautious, but it would seem reasonable to have some kind of guidance document beyond the typical "Do not do illegal things with this.." common in most open-source projects, especially given that LLMs are getting media attention.
The ToS contains this:
The user may only use the portal for the intended purposes. In particular, he/she may not misuse the portal. The user undertakes to refrain from generating text that violate criminal law, youth protection regulations or the applicable laws of the following countries: Federal Republic of Germany, United States of America (USA), Great Britain, user's place of residence. In particular it is prohibited to enter texts that lead to the creation of pornographic, violence-glorifying or paedosexual content and/or content that violates the personal rights of third parties. LAION reserves the right to file a criminal complaint with the competent authorities in the event of violations.
So it should be fairly simple to add a "Permitted/Prohibited" uses document to the repository, detailing what is and is not permitted for the areas of concern mentioned. (The list, as stated, is not exhaustive, and omits some situations that would already be prohibited by the TOS wording.) Otherwise some users will wrongly assume they can do anything they are legally permitted to do in their jurisdiction, subject only to the terms of the Apache license, which is clearly NOT my reading of the portal TOS you have just quoted.
I can fully understand and fully support a prohibition on the generation of illegal, obscene or violence-glorifying material. If, however, there isn't something in the repository that makes it abundantly clear that the models and tools have such usage restrictions, then potential re-users and implementers could be confused as to what are, and importantly what are not, permitted uses (including applications). Confusion is where misunderstandings and mistakes arise. If it's going to be a locked-down model/toolset, re-users need to know that, so that they can make more informed choices.
To be clear, that TOS applies to usage of the OA website, and users must agree to it to use the site. No restrictions beyond the Apache 2.0 License are or will be imposed by OA on the use of the OA code/models outside of our site.
Thank you for the clarification; I appreciate it. However, some kind of document about "Ethical use" would of course still be appreciated. It also signals to responsible re-users and implementers that concerns about potential misuse are being taken seriously.
I would also strongly suggest a future revision of the TOS (for the portal) to include an expanded definition of what is being considered pornography (as opposed to outright obscenity, which I fully support as being a prohibited use) and violence-glorifying material, as different jurisdictions may apply slightly different standards. If the former is meant to cover any kind of erotic or potentially "arousing" material at all, then the TOS should say that directly.
Ok, let's leave this issue open to get views from community members on what people may want to see included in an ethical use document!
I was not involved in creating the TOS, so I can't really comment. If someone from the team who was involved sees this, they may be able to go into further detail.
I'm minded to close/withdraw this issue based on comments in https://github.com/Stability-AI/StableLM/issues/3, specifically that "Community Guideline and similar ethical concern documents are actively harmful in open source."
Like @olliestanley said, the license is the license. Any ethical document/guide/recommendations doesn't have a place here.
No restrictions beyond the Apache 2.0 License are or will be imposed by OA on the use of the OA code/models outside of our site.
Good!
I think this can/should be closed.
Anybody, regardless of their (lack of) affiliation with Open Assistant, is free to publish any statement on what other people should do, on their own personal blog.
Since this is an open source project, I really don't see the point of having such a statement in the repo itself. Every person can have an individual stance that can change over time; there is no need to have one statement all participants have to agree with.