guidance
Is it possible to get `select` logprobs like in versions <0.1.0?
Is your feature request related to a problem? Please describe.
In old (handlebars) versions one could do something like:
program = guidance("The quick brown fox jumps over the lazy {{select 'animal' options=valid_animals logprobs='animals_logprobs'}}")
returned = program(valid_animals=["dog", "cat"])
print(returned["animals_logprobs"])
to get the relative logprobs of each option (i.e. in this case a dict of length 2). Is that possible?
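To make "relative logprobs" concrete, here is a minimal sketch of the normalization I have in mind. It assumes you already have one raw log-probability per option; the `raw` values and the `relative_logprobs` helper are made up for illustration and are not part of guidance.

```python
import math

def relative_logprobs(option_logprobs):
    """Renormalize per-option log-probs so the options' probabilities sum to 1."""
    values = list(option_logprobs.values())
    # Log-sum-exp, subtracting the max first for numerical stability.
    m = max(values)
    log_total = m + math.log(sum(math.exp(v - m) for v in values))
    return {opt: v - log_total for opt, v in option_logprobs.items()}

# Hypothetical raw scores for the two options (not real model output).
raw = {"dog": -1.2, "cat": -2.5}
animals_logprobs = relative_logprobs(raw)
print(animals_logprobs)
```

The gap between any two options is unchanged by the normalization; only the shared offset moves.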
Describe the solution you'd like
Something similar to the above maybe.
Describe alternatives you've considered
`llm._cache_state["logits"]` is not the same thing -- it only provides next-token logits, as I understand it.
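For context on why next-token logits alone fall short: an option that spans several tokens needs the sum of the conditional log-probs of each of its tokens. A rough sketch follows, where `next_token_logprobs` is a hypothetical callable standing in for whatever the model exposes, and `toy_model` is a fake uniform model used only so the snippet runs.

```python
import math

def score_option(prefix_tokens, option_tokens, next_token_logprobs):
    """Sum the conditional log-probs of an option's tokens (chain rule)."""
    context = list(prefix_tokens)
    total = 0.0
    for tok in option_tokens:
        # Log-prob of this token given everything before it.
        total += next_token_logprobs(context)[tok]
        context.append(tok)
    return total

# Toy stand-in for a model: uniform over a 4-token vocabulary (an assumption).
VOCAB = ["d", "og", "c", "at"]

def toy_model(context):
    return {tok: math.log(1.0 / len(VOCAB)) for tok in VOCAB}

dog_score = score_option(["lazy"], ["d", "og"], toy_model)
cat_score = score_option(["lazy"], ["c", "at"], toy_model)
```

With a real model this costs one forward pass per option token (or one batched pass per option), which is more than a single read of the cached logits.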
EDIT:
Ah, just found this comment - https://github.com/guidance-ai/guidance/blob/cf355c7ac12ce7ce9ddddea115329d7ec9eeb939/guidance/_grammar.py#L489C6-L489C6 - so maybe not.
I would be interested in implementing this if the owners are amenable, but would maybe just need a little bit of guidance (ha) to better understand the repo and whether or not the <0.1 implementation should just be pulled forward or whether something new needs to happen.
Happy to help implement this
@wjn0 have you figured out how to do this yet? I'm not sure this is the same thing you're asking for, but I found a hacky way that works for me with llama.cpp. Tinkering in a notebook, I noticed that setting `compute_log_probs=True` and `echo=True` displays the output, and hovering over each token shows its log prob. So the information is there, and since I couldn't find anything in the docs and was too lazy to change the library code, I simply parsed the spans from the HTML to get each token and its prob.
from bs4 import BeautifulSoup

def extract_spans_from_html(html_content):
    """
    Extracts the text and title attribute of every <span> in the given HTML content.

    Parameters:
    - html_content (str): A string containing HTML content.

    Returns:
    - list of dict: One {'text': ..., 'prob': ...} entry per <span> element found.
    """
    soup = BeautifulSoup(html_content, 'html.parser')
    spans = soup.find_all('span')
    log_dict = []
    for i in spans:
        log_dict.append({
            'text': i.text,
            # The hover text lives in the title attribute; None if a span has no title.
            'prob': i.attrs.get('title')
        })
    return log_dict

spans = extract_spans_from_html(lm._html())
Edit: never mind, clearly this is not what you were looking for, as it only shows the probs for the tokens that were actually generated and not for the alternatives.