
Add multi-label conditional choice generation example

Open davidberenstein1957 opened this issue 2 years ago • 7 comments

I was working on a tutorial for adding computational feedback to our data labelling platform and noticed that, in some situations, multi-label conditional choice generation would be useful.

I would love to tackle this in a PR if you feel this would be a nice addition.

davidberenstein1957 avatar Aug 16 '23 07:08 davidberenstein1957

Also, when my tutorial is done, it might be a nice applied example for your website on how to use it to work towards training a model?

davidberenstein1957 avatar Aug 16 '23 07:08 davidberenstein1957

I skimmed the paper and realized that the regex-like generation will not work, but I played around with the JSON generation, which proved useful for this use case. Do you think it makes sense to add an example to your README via a PR?

Multiple choices (multi-label)

from pydantic import BaseModel

import outlines.models as models
import outlines.text.generate as generate

model = models.transformers("gpt2")

class Topic(BaseModel):
    new_card: bool = False
    mortgage: bool = False
    application: bool = False
    payments: bool = False

sequence = generate.json(model, Topic)("I want a new bank card at my bank")
# {
#   "new_card": true,
#   "mortgage": false,
#   "application": true,
#   "payments": false
# }

davidberenstein1957 avatar Aug 16 '23 12:08 davidberenstein1957
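For the multi-label use case, the boolean flags still need to be turned into a list of labels. Here is a minimal sketch, assuming the generation step returns a JSON string shaped like the commented output above. The `raw_output` value and the `selected_labels` helper are hypothetical and not part of the library.

```python
import json

# Assumed raw output: a JSON string shaped like the commented result above.
raw_output = (
    '{"new_card": true, "mortgage": false, '
    '"application": true, "payments": false}'
)


def selected_labels(raw: str) -> list[str]:
    """Return the names of all fields set to true, in field order."""
    flags = json.loads(raw)
    return [name for name, value in flags.items() if value is True]


labels = selected_labels(raw_output)
print(labels)  # ['new_card', 'application']
```

An empty list here means the model set every flag to false, which is worth checking for before using the result as training data.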

Of course, any contribution that improves the documentation is greatly appreciated!

rlouf avatar Aug 16 '23 12:08 rlouf

@davidberenstein1957 do you need help on this?

rlouf avatar Oct 06 '23 09:10 rlouf

Hi Remi, programmatically, no. But in terms of usability, yes. I'll share a brief example later, but the approach does not seem to work properly in empirical evaluation. Reading the paper, it might not be the correct approach. What do you think?

davidberenstein1957 avatar Oct 07 '23 08:10 davidberenstein1957

Please share and I'll take a look!

rlouf avatar Oct 08 '23 06:10 rlouf

I skimmed the paper and realized that the regex-like generation will not work, but I played around with the JSON generation, which proved useful for this use case. Do you think it makes sense to add an example to your README via a PR?

Multiple choices (multi-label)

from pydantic import BaseModel

import outlines.models as models
import outlines.text.generate as generate

model = models.transformers("gpt2")

class Topic(BaseModel):
    new_card: bool = False
    mortgage: bool = False
    application: bool = False
    payments: bool = False

sequence = generate.json(model, Topic)("I want a new bank card at my bank")
# {
#   "new_card": true,
#   "mortgage": false,
#   "application": true,
#   "payments": false
# }

Trying this reveals that the model is not forced to pick anything: it may reply with all choices set to False. Can we somehow require it to answer with at least one option?

chris-aeviator avatar Apr 08 '24 08:04 chris-aeviator
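One possible way to rule out the all-False answer, sketched here under assumptions and not confirmed anywhere in this thread: enumerate every allowed combination of flags and join them into a single regular expression. The `at_least_one_pattern` helper is hypothetical. Whether the library's regex-guided generation accepts a pattern written this way is an assumption that would need checking; the sketch only uses the standard library to build and verify the pattern.

```python
import itertools
import json
import re

LABELS = ["new_card", "mortgage", "application", "payments"]


def at_least_one_pattern(labels):
    """Build a regex matching compact JSON objects with at least one true flag."""
    alternatives = []
    for values in itertools.product([True, False], repeat=len(labels)):
        if not any(values):
            continue  # skip the all-false combination
        # Compact JSON (no spaces) keeps each alternative unambiguous.
        obj = json.dumps(dict(zip(labels, values)), separators=(",", ":"))
        alternatives.append(re.escape(obj))
    return "(?:" + "|".join(alternatives) + ")"


PATTERN = at_least_one_pattern(LABELS)

all_false = json.dumps(dict.fromkeys(LABELS, False), separators=(",", ":"))
one_true = json.dumps(
    {**dict.fromkeys(LABELS, False), "new_card": True}, separators=(",", ":")
)

print(re.fullmatch(PATTERN, all_false) is None)      # True: all-false is rejected
print(re.fullmatch(PATTERN, one_true) is not None)   # True: one label is enough
```

The number of alternatives grows as 2^n - 1, so this only stays practical for a handful of labels. A lighter option is to validate after generation and retry when no flag is set.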