
Allow users to specify prompt topics they're interested in

hecko-yes opened this issue 2 years ago • 20 comments

Nobody knows everything, and having to continually skip prompts about things one isn't interested in can get tiring. At least two people (one, two) have suggested this on Discord, with a third generally complaining about having to skip so often ("my feedback is 'I am not an expert on this topic' for 99% of the skips that I do").

Methods I can think of:

  • preset categories, to be set by either the prompt creator, another contributor, or a classification model
    • :+1: simple to implement
    • :-1: inflexible
  • a zero-shot classifier (preferably multilingual) where users specify categories; see the sketch after this list
    • :+1: flexible, decently accurate (good enough for at least sorting by match probability)
    • :-1: would need to be run for every single prompt-category pair (even the smallest model I could find can only do ~40 pairs per second on HuggingFace servers)
  • something similar to Lbl2Vec
    • :+1: each prompt only needs to be calculated once, then it's just cosine similarity
    • :-1: generating the data point for a "label" requires averaging documents; would need to have the user pick prompts they're interested in instead of just writing a word
  • a model that maps both prompts and single-word labels to a roughly equivalent latent space, similar to CLIP
    • :+1: best of both worlds
    • :-1: would likely have to be custom-made
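
To make the zero-shot option above concrete, here is a minimal sketch using the Hugging Face `transformers` zero-shot pipeline; the model and category names are only illustrative assumptions:

```python
# Minimal sketch of zero-shot topic matching; the model is one multilingual example.
from transformers import pipeline

classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

user_categories = ["programming", "biology", "history"]  # user-specified labels
prompt = "How do I reverse a linked list in C?"

# multi_label=True scores each category independently, so prompts can be
# sorted by match probability rather than forced into a single class.
result = classifier(prompt, candidate_labels=user_categories, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```

Note that this runs one NLI forward pass per prompt-category pair, which is exactly the throughput concern mentioned above.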

hecko-yes avatar Jan 29 '23 23:01 hecko-yes

May I suggest an opposite approach:

  • At first, query the user for everything
  • Train a model that will recognize which topics are NOT GOOD for that particular user

This way, we get users to provide information on topics they didn't even know they were familiar with.

It might even spark them to become interested in these topics — profit for both parties.

jerzydziewierz avatar Jan 30 '23 12:01 jerzydziewierz

@jerzydziewierz My approach is to choose to skip all topics that I can't provide a high-quality answer for. Would the flipped model still work in my case?

mashdragon avatar Jan 30 '23 15:01 mashdragon

> @jerzydziewierz My approach is to choose to skip all topics that I can't provide a high-quality answer for. Would the flipped model still work in my case?

Yes, very much so -- it would simply learn about the kinds of topics that you skip.

So at first, it will give you a query, but if you refuse to answer, it will lower the chance that you will get a similar query in the future.

The rationale is:

  • It is not feasible to provide a comprehensive list of topics for positive selection, since new topics might come in over time.
  • Even if you are presented with a list of topics at first, a topic name necessarily does not fully reflect the kinds of questions from inside that topic that you will actually be asked about.
  • It lowers the barrier to entry for new data providers.
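
As a rough illustration of what that down-weighting could look like, assuming prompts are embedded with some sentence-embedding model (everything here is hypothetical, not an existing component):

```python
# Hypothetical sketch: bias prompt sampling away from topics a user skips.
import numpy as np

def sampling_weights(prompt_vecs: np.ndarray, skipped_vecs: np.ndarray) -> np.ndarray:
    """Lower a prompt's weight the closer it is to the user's skip history."""
    if len(skipped_vecs) == 0:
        return np.ones(len(prompt_vecs))      # no history yet: sample uniformly
    centroid = skipped_vecs.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    unit = prompt_vecs / np.linalg.norm(prompt_vecs, axis=1, keepdims=True)
    sim = unit @ centroid                     # cosine similarity in [-1, 1]
    return np.clip(1.0 - sim, 0.05, None)     # small floor so no topic dies out entirely

# The weights can then drive sampling, e.g.:
#   rng.choice(len(prompts), p=weights / weights.sum())
```

The floor in the last line matters: the chance of a "skipped" topic is lowered, as described above, but never driven to zero.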

jerzydziewierz avatar Jan 30 '23 15:01 jerzydziewierz

Here is another possible solution: partly pre-created, partly crowdsourced hierarchical/tree topics.

Manually generate the top hierarchical topics, like Programming → JavaScript or Science → Biology, then crowdsource subtopics, for example Biology → DNA or JavaScript → Node.js. (Some further sublevels should probably be manually generated to guide the crowdsourced topic tree's structure, but an admin/mod should also be able to move topics and correct the hierarchy after the fact when needed while a new topic branch is being populated.) Users who create initial prompts can suggest new subtopics, or use existing ones, at prompt creation. If the prompter is unsure of the specific field the prompt is about, they can still use a parent topic like Science. Because the topics are connected hierarchically, any science expert in any field could still get the task and direct it to the right expert: even if they are not an expert in the specific prompt topic, they probably know the correct subtopic to tag the prompt with.

Expertise subtopics could be added by users on their profile page, and could even be part of the registration process. A classification model could at a later stage use the accumulated data to suggest subtopics to prompters automatically.

paal85 avatar Feb 06 '23 22:02 paal85

We need a simple, fast, good-enough fix, because efficiency is very bad the way it is now. I was thinking about the ability to write an entire conversation as one person, and the ability to write prompts/replies/conversations based on existing ones (for example, correcting bad replies). These could just be two additional tasks, so it should be straightforward and relatively easy to implement. It doesn't exactly solve this problem, but I think it would speed things up quite a lot. Unfortunately, I personally have no clue about web-related stuff, so I can't help implement this.

xvel avatar Feb 06 '23 22:02 xvel

> We need a simple, fast, good-enough fix, because efficiency is very bad the way it is now. I was thinking about the ability to write an entire conversation as one person, and the ability to write prompts/replies/conversations based on existing ones (for example, correcting bad replies). These could just be two additional tasks, so it should be straightforward and relatively easy to implement. It doesn't exactly solve this problem, but I think it would speed things up quite a lot. Unfortunately, I personally have no clue about web-related stuff, so I can't help implement this.

The fastest solution is probably to add topic tags: one prompt can have several tags, and users can subscribe to tags they are interested in covering. Tags can be created by users on the fly if they don't exist yet. We could also add the ability to answer one's own prompts using subscribed tags. I need to fill my portfolio anyway while applying to jobs; I could have it up within 2-3 days, though I'd need to get familiar with the code base first. Tags could later be made hierarchical, to allow users to subscribe to all tags within a branch without needing to subscribe to every new tag they may not know about or that is created later. So on second thought, it may just be better to make it hierarchical from the start. I should be able to make that within a week.
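
A minimal sketch of the hierarchical part, assuming a simple parent-pointer (adjacency-list) tag store; none of these names reflect the actual Open-Assistant schema:

```python
# Hypothetical tag tree: each tag points at its parent (None for root tags).
from collections import defaultdict

parents = {
    "programming": None,
    "javascript": "programming",
    "nodejs": "javascript",
    "science": None,
    "biology": "science",
    "dna": "biology",
}

children = defaultdict(list)
for tag, parent in parents.items():
    if parent is not None:
        children[parent].append(tag)

def branch(tag: str) -> set[str]:
    """All tags in the subtree rooted at `tag`, i.e. a branch subscription."""
    tags = {tag}
    for child in children[tag]:
        tags |= branch(child)
    return tags

print(branch("programming"))  # e.g. {'programming', 'javascript', 'nodejs'}
```

Subscribing to a parent tag then automatically covers tags added deeper in that branch later, which addresses the "new tags the user doesn't know about" concern.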

paal85 avatar Feb 06 '23 23:02 paal85

> Manually generate the top hierarchical topics, like Programming → JavaScript or Science → Biology

How about a pre-existing classification system? My first idea was a library classification system, but the only free one I could find was that of the Library of Congress, which seems to have no machine-readable format. (Edit: There's also the Free Decimal Correspondence, but that in turn isn't very modern.) WikiProjects, maybe?

hecko-yes avatar Feb 07 '23 01:02 hecko-yes

>> Manually generate the top hierarchical topics, like Programming → JavaScript or Science → Biology
>
> How about a pre-existing classification system? My first idea was a library classification system, but the only free one I could find was that of the Library of Congress, which seems to have no machine-readable format. (Edit: There's also the Free Decimal Correspondence, but that in turn isn't very modern.) WikiProjects, maybe?

WikiProjects looks useful. I would favour a simple approach here: the more specific we make the categories, the harder it will be to get them correct, and the more likely it is that some categories are selected by nobody and the messages in them get ignored. I am happy to do the backend work required to support message categories if we can decide on a suitable list of categories.

olliestanley avatar Feb 07 '23 17:02 olliestanley

I have created #1313 #1314 #1316 to track the backend tasks which will be required for this. Once they have been completed, we will need frontend tasks as well. @Sobsz, you have put thought into this, so I would be happy to go with your recommendation on what list of categories we use and how we apply them.

olliestanley avatar Feb 07 '23 17:02 olliestanley

> Here is another possible solution: partly pre-created, partly crowdsourced hierarchical/tree topics.
>
> Manually generate the top hierarchical topics, like Programming → JavaScript or Science → Biology, then crowdsource subtopics, for example Biology → DNA or JavaScript → Node.js. (Some further sublevels should probably be manually generated to guide the crowdsourced topic tree's structure, but an admin/mod should also be able to move topics and correct the hierarchy after the fact when needed while a new topic branch is being populated.) Users who create initial prompts can suggest new subtopics, or use existing ones, at prompt creation. If the prompter is unsure of the specific field the prompt is about, they can still use a parent topic like Science. Because the topics are connected hierarchically, any science expert in any field could still get the task and direct it to the right expert: even if they are not an expert in the specific prompt topic, they probably know the correct subtopic to tag the prompt with.
>
> Expertise subtopics could be added by users on their profile page, and could even be part of the registration process. A classification model could at a later stage use the accumulated data to suggest subtopics to prompters automatically.

I think this is a good idea, but it requires a lot of user engagement to get set up. I've never liked one-time get-to-know-the-user systems, because a lot of work ends up going into a "choose your topics" page that never gets updated even as personal preferences (or in this case proficiencies) change. I think something more flexible that recommends based on prior actions is a leaner path to similar results.

newnativeabq avatar Feb 07 '23 17:02 newnativeabq

I like the Lbl2Vec idea. There's no need for direct user input if it's put into a two-tiered recommender system: you just recalculate the user's base vector (preferences) as some aggregation (a componentwise mean of submitted vectors). This can be done pretty lean, but it needs a search tree, and I would not suggest implementing and building search trees on the fly... it's a hassle. Some great vector databases are popping up.

Much easier: if tags are going to be used anyway, then every prompt/response will have a cluster of tags; call that [T]. If a prompt has an id ID, then a data system can supply ID-[T] easily. [T] can be loaded into Elasticsearch or something like Vespa, and when looking for new prompts, the recommendation can be done on the fly at scale. Again, user input is not required: just store the set of tags, and the frequencies with which they occur in completed responses, to search on. This would be significantly easier to maintain; a rough sketch follows the list below.

  • Given all existing prompts, the system is cold-started by loading id-[T] into the text-search database, and users are initialized with an empty history H0 = [].
  • As a prompt/response is created, it's added to the database.
  • Tags can be applied to the prompt/response at any time, and id-[T] is updated.
  • When a user comes to look for work, their history [H] is computed if not stored, and a search S is built, S = s(H).
  • The search is then run and the results are parsed by the frontend.
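
As a rough illustration of this flow, here is an in-memory Python stand-in for the tag-frequency search (in practice Elasticsearch/Vespa would do the ranking; all names are made up):

```python
# Illustrative in-memory version of the tag-history recommender described above.
from collections import Counter

prompt_tags = {                     # id -> [T]
    "p1": {"python", "recursion"},
    "p2": {"biology", "dna"},
    "p3": {"python", "webdev"},
}

def recommend(history: Counter, done: set, limit: int = 20) -> list[str]:
    """Rank open prompts by tag overlap with the user's history [H]."""
    candidates = [p for p in prompt_tags if p not in done]
    return sorted(candidates,
                  key=lambda p: sum(history[t] for t in prompt_tags[p]),
                  reverse=True)[:limit]

history = Counter()                 # H0 = [] : cold-started user
history.update(prompt_tags["p1"])   # user completes p1; its tags feed the history
print(recommend(history, done={"p1"}))  # ['p3', 'p2'] -- the python prompt ranks first
```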

newnativeabq avatar Feb 07 '23 17:02 newnativeabq

> I like the Lbl2Vec idea. There's no need for direct user input if it's put into a two-tiered recommender system: you just recalculate the user's base vector (preferences) as some aggregation (a componentwise mean of submitted vectors). This can be done pretty lean, but it needs a search tree, and I would not suggest implementing and building search trees on the fly... it's a hassle. Some great vector databases are popping up.
>
> Much easier: if tags are going to be used anyway, then every prompt/response will have a cluster of tags; call that [T]. If a prompt has an id ID, then a data system can supply ID-[T] easily. [T] can be loaded into Elasticsearch or something like Vespa, and when looking for new prompts, the recommendation can be done on the fly at scale. Again, user input is not required: just store the set of tags, and the frequencies with which they occur in completed responses, to search on. This would be significantly easier to maintain.
>
> • Given all existing prompts, the system is cold-started by loading id-[T] into the text-search database, and users are initialized with an empty history H0 = [].
> • As a prompt/response is created, it's added to the database.
> • Tags can be applied to the prompt/response at any time, and id-[T] is updated.
> • When a user comes to look for work, their history [H] is computed if not stored, and a search S is built, S = s(H).
> • The search is then run and the results are parsed by the frontend.

This would of course be great to have, but I feel like it would surely be a lot of engineering work (and potentially backend compute/resources) compared to a simple text classifier with some predefined categories and allowing users to select their preferred categories, no?

olliestanley avatar Feb 07 '23 17:02 olliestanley

>> Here is another possible solution: partly pre-created, partly crowdsourced hierarchical/tree topics. Manually generate the top hierarchical topics, like Programming → JavaScript or Science → Biology, then crowdsource subtopics, for example Biology → DNA or JavaScript → Node.js. (Some further sublevels should probably be manually generated to guide the crowdsourced topic tree's structure, but an admin/mod should also be able to move topics and correct the hierarchy after the fact when needed while a new topic branch is being populated.) Users who create initial prompts can suggest new subtopics, or use existing ones, at prompt creation. If the prompter is unsure of the specific field the prompt is about, they can still use a parent topic like Science. Because the topics are connected hierarchically, any science expert in any field could still get the task and direct it to the right expert: even if they are not an expert in the specific prompt topic, they probably know the correct subtopic to tag the prompt with. Expertise subtopics could be added by users on their profile page, and could even be part of the registration process. A classification model could at a later stage use the accumulated data to suggest subtopics to prompters automatically.
>
> I think this is a good idea, but it requires a lot of user engagement to get set up. I've never liked one-time get-to-know-the-user systems, because a lot of work ends up going into a "choose your topics" page that never gets updated even as personal preferences (or in this case proficiencies) change. I think something more flexible that recommends based on prior actions is a leaner path to similar results.

Thanks, I can see that is a possibility. But there are ways to counter it: for example, to counter a stale topic list, let users easily one-click add/remove topics as they come across them while doing tasks. But no matter how it is solved, I think the user should have ultimate control to add or remove expertise topics even if they are added automatically, as they will know what topics they prefer better than any trained model ever could.

paal85 avatar Feb 07 '23 18:02 paal85

It depends on where the system(s) are now, I think. The underlying data model and CRUD operations on it (data_id: [categories]) are different. If the user database is all SQL and people would rather it stay that way, there isn't much point in using a document database and copying the text over, trying to maintain that. If the categories aren't stored anywhere yet, then implementing a document model to store users, data object ids, and their categories or other information is not bad. I dunno. That's a big decision. Where these systems start to differ big time, I think, is in the search and ranking portions.

To search and rank based on pre-selected categories, the system ends up a mishmash of SQL and rules in my experience:

1. An over-returning query (this has to be carefully engineered to scale or result quality diminishes over time)
SELECT * FROM prompts WHERE category IN (SELECT category FROM user WHERE user_id = {current_user.id})

2. A ranking algorithm (ranking becomes a huge challenge if done by rules when the rules have to be applied over large collections)
in python: ranks = [(prompt.id, len(set(user.categories).intersection(prompt.categories))) for prompt in returned_prompts]

3. A sort/filter/return statement (need to keep the input to this step 'small'; we don't want 100k prompts going through this every time someone refreshes their dashboard)
ordered_return = []
for prompt in sorted(ranks, key=lambda x: x[1], reverse=True):  # best matches first
    if len(ordered_return) >= max_return_length:
        break
    ordered_return.append(prompt)

Getting that to work well at scale isn't fun. Even though it seems like a lot, vector-database and/or document-search systems (like Vespa or Elasticsearch) handle a lot of this right out of the box, though the setup isn't as well documented.

1. Get user categories (self-selected, or learned from counts over time, or something similar)
SELECT category FROM user WHERE user_id = {current_user.id}

2. Request the N nearest matches (or indexed neighbors if using an embedding/tree approach)
// Fire off the request to the prompt collection/database (all in one service now);
// userCategories is the list retrieved in step 1
const request = $.ajax({
    url: "/prompt/recommend/",
    type: "post",
    data: { top_categories: userCategories, num_items: 20 }
});

The difference is in the search/ranking implementation, and that's where I think the time and engineering savings from simple text search come in.

newnativeabq avatar Feb 07 '23 20:02 newnativeabq

I think that you might be seriously overthinking it. The proper solution might be to simply show the user a list of recent prompts (for instance, 100 per page) and let the user decide.

The benefits of such a solution are (1) simplicity of implementation and (2) giving the contributor full control.

The disadvantage is some extra time needed to scan the available prompts. However, this time is marginal compared to the time needed to write a proper reply. On the other hand, there might be some satisfaction in being able to choose the most relevant or interesting topic, and I would guess that a contributor is a better judge of what they want to answer than an ML system would be.
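
For what it's worth, the data access for this is a single paginated query; here is a sketch using the Python DB-API, with table and column names that are assumptions:

```python
# Hypothetical paginated "recent prompts" query (schema names are not the real ones).
def recent_prompts(conn, page: int, per_page: int = 100):
    """Return one page of prompts, newest first."""
    return conn.execute(
        "SELECT id, text, created_at FROM prompts "
        "ORDER BY created_at DESC LIMIT ? OFFSET ?",
        (per_page, page * per_page),
    ).fetchall()
```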

4cb42b avatar Feb 07 '23 21:02 4cb42b

Maybe. I do have a tendency to deliberate. I'm not a frontend developer, though. To me, handling complex UI state and interactions is more difficult than nearest neighbor searches.

Recent entries (recent prompts) aren't bad in my experience; timestamp indices and partitioning are pretty good everywhere. As for contributor control, I'm not sure: I think it's not quite right to say someone has control over how they interact and what they see when their only option is to see 100 or 1000 items sorted by most recent. In my mind, as soon as I try to sort by anything meaningful (user-controlled ranking), this structure stops being nice to work on and the UI becomes more complicated. I could see this being more tractable if all prompts were scored on a few well-understood categories the user could then sort on. I'm not sure allowing user-defined categories will scale in practice, and the UI design may be hard. Dunno. Not a frontend person.

newnativeabq avatar Feb 07 '23 22:02 newnativeabq

Hmm. Re-reading the intro, the goal seems to be to reduce skips.

What makes a user skip?

newnativeabq avatar Feb 07 '23 22:02 newnativeabq

> Hmm. Re-reading the intro, the goal seems to be to reduce skips.
>
> What makes a user skip?

Users tend to tell us they skip because they don't know the subject area (e.g. non-coders getting prompts which ask them to write code). So the idea would mostly be opting out of subjects you don't want, to avoid having to skip. Essentially this is a user quality-of-life feature, although maybe it would also improve data quality if it causes people to opt out of topics instead of copying and pasting the first answer they found on Google.

My concern with a large, flexible solution like the vector DB approach is that it is likely only useful on a time-limited basis (once we train the first assistant model, most user contribution will be labelling and ranking AI-generated answers, not writing their own). It also won't have to be scaled up (much) beyond where it is now.

olliestanley avatar Feb 07 '23 22:02 olliestanley

That changes things. I didn't know if the training was going to be an ongoing effort at ever greater scale, or if it'd be deprecated after tuning.

If that's the case, I'd also vote against anything requiring large UI updates. That would include a user-category selection system and accompanying backend componentry. It's about the same work all told.

newnativeabq avatar Feb 07 '23 23:02 newnativeabq

To make the UI changes as minimal as possible, maybe the tag list could sit next to the up/down vote buttons? That would allow people to tag untagged posts. And for repliers, there could be a setting such as "Only give me prompts with the following tags:".

CheckMC avatar Feb 23 '23 15:02 CheckMC