label-studio-ml-backend: I want to create an RLHF backend/frontend for a labelling <=> training <=> error-correction loop.

If anyone has any leads on this, please let me know. Also, if anyone wants to collaborate in this direction, please let me know.

Mar 27 '23 18:03 hemangjoshi37a

Have you checked active learning?

https://docs.heartex.com/guide/active_learning.html

https://www.youtube.com/watch?v=8EO4vOw1MZc

Mar 29 '23 03:03 makseq

@makseq Active learning is good, but RLHF is quite different because it uses reinforcement learning to optimize the model. All in all, if you know what RLHF is, you know it is quite different from active learning.

Mar 29 '23 06:03 hemangjoshi37a

Yes, I know, but I would like to see the workflow you have in mind in LS to achieve it. It seems you need Accept/Reject actions for your annotations? Or ranking?

Mar 31 '23 01:03 makseq

Yes, RLHF can be done in multiple ways. The feedback can be yes/no (accept/reject) or ranking.
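
As a rough sketch of how both feedback types could feed the same training loop: each can be reduced to (prompt, chosen, rejected) preference pairs, which is a common input format for training a reward model. The function names and data shapes here are hypothetical, not part of Label Studio or any existing backend.

```python
from itertools import combinations

def pairs_from_ranking(prompt, ranked_responses):
    """Turn a ranking (best response first) into preference pairs.

    Hypothetical helper: every higher-ranked response is treated as
    'chosen' over every lower-ranked one.
    """
    # combinations() keeps input order, so the first item of each pair
    # is always the higher-ranked response.
    return [(prompt, better, worse)
            for better, worse in combinations(ranked_responses, 2)]

def pairs_from_accept_reject(prompt, labeled_responses):
    """Turn yes/no labels into preference pairs.

    Hypothetical helper: labeled_responses is a list of
    (response, accepted) tuples; each accepted response is preferred
    over each rejected response for the same prompt.
    """
    accepted = [resp for resp, ok in labeled_responses if ok]
    rejected = [resp for resp, ok in labeled_responses if not ok]
    return [(prompt, a, r) for a in accepted for r in rejected]
```

Note that yes/no labels only yield pairs when a prompt has at least one accepted and one rejected response, while a ranking of n responses yields n*(n-1)/2 pairs.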

Mar 31 '23 05:03 hemangjoshi37a

Basically, what I propose is to have a generalized RLHF model that sits at the output side of any model, so that instead of supervised training we can have unsupervised training that is supervised by the reinforcement model.
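
One possible reading of this, as a minimal sketch: a reward function sits after any generator and scores its outputs, and those scores stand in for labels. The example below only does reward-guided selection (best-of-n), which is a simpler stand-in and not a full reinforcement learning update such as PPO. `score_outputs`, `best_of_n`, and `toy_reward` are hypothetical names; in practice the reward function would be a model trained on human feedback.

```python
from typing import Callable, List, Tuple

# Assumed signature for illustration: (prompt, response) -> score.
RewardFn = Callable[[str, str], float]

def score_outputs(prompt: str, candidates: List[str],
                  reward_fn: RewardFn) -> List[Tuple[str, float]]:
    """Score each candidate output and sort from highest to lowest reward."""
    scored = [(cand, reward_fn(prompt, cand)) for cand in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)

def best_of_n(prompt: str, candidates: List[str], reward_fn: RewardFn) -> str:
    """Return the candidate with the highest reward."""
    return score_outputs(prompt, candidates, reward_fn)[0][0]

def toy_reward(prompt: str, response: str) -> float:
    """Placeholder reward that prefers shorter responses.

    A real setup would replace this with a learned reward model.
    """
    return -float(len(response))
```

Because the reward function only sees the prompt and the output text, the same scorer can be placed behind any model that produces candidates.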

Mar 31 '23 05:03 hemangjoshi37a

Maybe this repo will be helpful for you: https://github.com/heartexlabs/label-studio-RLHF/

Apr 14 '23 01:04 makseq

@makseq Maybe it is a private repo. It is giving me a 404 error.

Apr 14 '23 04:04 hemangjoshi37a

@hemangjoshi37a Sorry, could you please check this one? https://github.com/heartexlabs/RLHF

Apr 28 '23 01:04 makseq