Rework Guidelines documentation

Open horribleCodes opened this issue 2 years ago • 25 comments

As of right now, there's no place where you can find a concise summary of the guidelines. Ideally, every task should be clearly defined, with a list of points that are critical to creating high-quality data. If something can be left to the interpretation of the person submitting a prompt or reply, this should also be made clear. I strongly recommend that these lists be either embedded or easily accessible to users for their respective tasks, as I've mentioned in #1320.

These changes are by no means exhaustive, or final. I also have to stress that I'm not a lawyer and could very well be missing or misinterpreting something. Feel free to suggest any additional changes to make things more clear.

horribleCodes avatar Feb 08 '23 14:02 horribleCodes

:x: pre-commit failed. Please run pre-commit run --all-files locally and commit the changes. Find more information in the repository's CONTRIBUTING.md

github-actions[bot] avatar Feb 08 '23 14:02 github-actions[bot]

> Should we adhere to a specific format for text in general, for example Markdown?

Markdown seems good, I've been using it so far. I think simple bullet points of dos and don'ts give readers a quick overview in case they want to check whether something is allowed.

  • Do inform the prompter if the assistant is making assumptions not explicitly specified in the prompt. For example, assuming the prompter's goal or level of experience.

I think that's a good idea, but I'd like to see if we can refine this a bit further. It seems really inconvenient if the assistant prefaced everything with "I am assuming you are referring to this specific goal, and you are a novice about this subject." I guess an ideal personal assistant would gain an understanding of your level of expertise and adjust its replies accordingly, but until then, I don't think it's necessary to spell this out. I see it this way:

  • Keep it simple, and avoid jargon unless it has been used by either party.
  • Adjust if the user asks for a more complex or simplified explanation.
  • If it's not possible to determine a request from a prompt, or if the reply to one interpretation would run counter to another interpretation, ask for clarification.

I think this would avoid any confusion without interfering with the natural flow of the conversation.

horribleCodes avatar Feb 09 '23 09:02 horribleCodes

Should I also ask assistant replies to include Markdown-style formatting? I assume it will be enabled for the actual model, so we probably should take advantage of that.

Likewise, I've seen many different styles of formatting: some just write simple paragraphs, some preface each paragraph with a title, and some of these are separated from the paragraph... It probably will cause the actual output to have very inconsistent formatting, which might confuse the user. Should we agree on a uniform system? Or is there a way to finetune the model to use a consistent method of formatting?

horribleCodes avatar Feb 10 '23 09:02 horribleCodes

I am in favor of instructing people to use markdown formatting (but I would be fine with some other standard as well).

JohannesGaessler avatar Feb 10 '23 10:02 JohannesGaessler

@horribleCodes you would need to get the docs site up and running locally to see how this PR would look.

See installation and local dev in here: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/README.md

If changing and removing files you would also just need to reflect the new structure here: https://github.com/LAION-AI/Open-Assistant/blob/main/docs/sidebars.js

I can help if you have any issues getting a local docs site up and running so that you can see how the changes in the PR will end up looking on the site (at the moment I think it would crash, as sidebars.js is looking for prompting.md but this PR deletes it).

So it's just a small bit of Docusaurus stuff, and then you can mostly just refactor and keep the docs as md files. You just need to tell Docusaurus which ones to use and where to render them on the docs site.
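
The crash described above (the sidebar pointing at a file the PR deletes) can be caught before starting the site. The following is a minimal sketch, not part of the repository: it assumes each sidebar entry is a plain doc id that maps to `<id>.md` or `<id>.mdx` inside the docs content directory, and that the ids have already been copied out of sidebars.js by hand. The function name `missing_docs` and the example ids are hypothetical.

```python
from pathlib import Path


def missing_docs(doc_ids, docs_dir):
    """Return the doc ids that have no matching .md or .mdx file in docs_dir.

    Assumption: a doc id is the file path relative to docs_dir without
    its extension. Ids overridden in front matter are not handled here.
    """
    root = Path(docs_dir)
    missing = []
    for doc_id in doc_ids:
        candidates = [root / f"{doc_id}.md", root / f"{doc_id}.mdx"]
        if not any(path.is_file() for path in candidates):
            missing.append(doc_id)
    return missing
```

For example, calling `missing_docs(["prompting", "guidelines"], "docs/docs")` after deleting prompting.md would report `prompting` as missing, which is the entry that needs updating in sidebars.js. The directory `docs/docs` is an assumed location.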

andrewm4894 avatar Feb 10 '23 11:02 andrewm4894

I majorly reworked the document - first, since the guidelines weren't just limited to writing prompts, I renamed it to Guidelines. I also moved the examples to another document since I felt it hurt readability and titles could help people find what they're looking for.

I also completed the dos and don'ts for each task and added an explanation for each label. Again, I'd like to stress that I might very well be wrong with some points and gladly amend anything if it makes sense.

Sure, I can do that - I'm not at my desktop just yet, so I haven't had a chance to run it locally since I've forked the project.

horribleCodes avatar Feb 10 '23 11:02 horribleCodes

A suggestion: we should maybe tell users not to demand insane amounts of work as the prompter. This only encourages poor-quality answers, most likely ChatGPT spam. Example:

https://open-assistant.io/messages/efdcfb29-d1e0-47a9-80fd-9e157f60250e

JohannesGaessler avatar Feb 10 '23 14:02 JohannesGaessler

I don't think that's a problem. In this case, the initial prompt specifically asked for a step-by-step guide on the topic, and the assistant provided just that. If we prohibit thorough instructions that require multiple steps and a lot of effort to perform, the model won't be able to provide them to the user, either. The person playing the user is required to submit something that builds on the conversation, not to do everything the assistant tells them to.

In this particular case, a message like "How difficult would it be to include multi-core processing?" or "How can I set up a test environment for my OS?" would be a perfectly serviceable continuation of the conversation - it further delves into the topic, and doesn't require the user to try out anything the assistant suggested.

horribleCodes avatar Feb 10 '23 16:02 horribleCodes

Not really sure why the pre-commit check is failing. I also tried running the site on a codespace, but I keep getting a 502 error. If anyone has any advice, I'm all ears.

horribleCodes avatar Feb 11 '23 11:02 horribleCodes

The docs site is usually easy enough to get going on a laptop:

    cd docs
    yarn install
    yarn start

This should bring up the docs site at localhost:3000.

andrewm4894 avatar Feb 11 '23 12:02 andrewm4894

I can also have a look and try your branch next time I'm at my laptop.

andrewm4894 avatar Feb 11 '23 12:02 andrewm4894

Got it to work locally, had some issues setting up yarn. The changes look good, the links are working. I also didn't realize pre-commit was its own thing, I assumed it was part of vanilla Git.

horribleCodes avatar Feb 12 '23 10:02 horribleCodes

Suggestion: provide numerical values both as metric and imperial units when appropriate.

JohannesGaessler avatar Feb 14 '23 11:02 JohannesGaessler

I like that idea, though personally I think allowing prompt engineering to declare a preferred system would make the most sense. Given this goal, would it still be better to include both?
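
To make the "include both" option concrete, here is a minimal sketch of what a dual-unit value could look like in a reply. It is only an illustration: the helper names and the "metric first, imperial in parentheses" layout are assumptions, not something the guidelines prescribe. The conversion factors are the exact definitions of the international mile and the avoirdupois pound.

```python
KM_PER_MILE = 1.609344      # exact, by definition of the international mile
KG_PER_POUND = 0.45359237   # exact, by definition of the avoirdupois pound


def km_with_miles(km, digits=1):
    """Format a distance as metric first, with miles in parentheses."""
    miles = km / KM_PER_MILE
    return f"{km:g} km ({miles:.{digits}f} mi)"


def kg_with_pounds(kg, digits=1):
    """Format a mass as metric first, with pounds in parentheses."""
    pounds = kg / KG_PER_POUND
    return f"{kg:g} kg ({pounds:.{digits}f} lb)"
```

With these helpers, `km_with_miles(5)` gives `5 km (3.1 mi)` and `kg_with_pounds(2)` gives `2 kg (4.4 lb)`. If the prompter declares a preferred system, a reply could simply drop the parenthesised part.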

horribleCodes avatar Feb 14 '23 14:02 horribleCodes

Lots of good work in here. I think we should try to get it merged and then add any further improvements as follow-on PRs.

andrewm4894 avatar Feb 15 '23 11:02 andrewm4894