guidance
guidance copied to clipboard
Terminal colors
This PR:
- Automatically detects whether the user is running in the command-line or in a notebook
- When in a notebook,
echo=Truegives a pretty HTML display - When in the command-line, raw text is printed to stdout (no more
<IPython.core.display.HTML object>!)
- When in a notebook,
- Uses ANSI color codes to color generated text
- Green by default
- If
coloredpackage is installed (added as optional dependency), colors will depend on log-probabilities of generated tokens
Thank you @dimondjik for your implementation!
In action:
@Harsha-Nori @dimondjik :)
So happy to see this! I think it looks wonderful.
Would appreciate @nopdive 's review given Sam's experience in notebook env detection
Also, just a thought -- might be fun to have the commit co-authored by @dimondjik if comfortable!
Could this be refactored? It looks like it has a potentially fragile regex for parsing out styling information, and is sitting as another stage in the pipeline.
Should Engine acquire another property such as output_processor which would be some abstract base class (or rather, some reasonable Python facsimile thereof). Wherever our current code writes some output, it calls the appropriate method on the class. We then have implementations of the output_processor for Python Notebooks, console output etc.
This would also let us put all the styling information in the output_processor itself, rather than have it scattered across the main codebase.
Could this be refactored? It looks like it has a potentially fragile regex for parsing out styling information, and is sitting as another stage in the pipeline.
Should
Engineacquire another property such asoutput_processorwhich would be some abstract base class (or rather, some reasonable Python facsimile thereof). Wherever our current code writes some output, it calls the appropriate method on the class. We then have implementations of theoutput_processorfor Python Notebooks, console output etc.This would also let us put all the styling information in the
output_processoritself, rather than have it scattered across the main codebase.
Reasonable concerns and a nice suggestion. It would be nice to encapsulate the existing pretty-output code regardless. I'll see what I can do 😎
So happy to see this! I think it looks wonderful.
Would appreciate @nopdive 's review given Sam's experience in notebook env detection
Also, just a thought -- might be fun to have the commit co-authored by @dimondjik if comfortable!
@Harsha-Nori do you know if it's possible to add a co-author during the squash-and-merge process? Otherwise, I really think that the first commit on this PR was 99.99% @dimondjik ... maybe I can amend the history of this branch to properly credit them as an author/contributor :)
Nice! First of all, I wasn't expecting it to get anywhere at all, I'm glad I could help! @hudson-ai I thank you for the PR from all the people working in terminal and not seeing colors like Notebook people XD
I've read the thread and I absolutely 100% sure that @riedgar-ms 's suggestion is the way.
But, I live by the motto "don't touch something that works", so from my perspective trying to rewrite something in Engine, when we can just add extra layer (yes, it is making the output somewhat slow, sadly), so two and a half programmers can see colors in terminal when debugging. I believe in production - colors in terminal is not that important. And also I am too lazy to dig inside the code.
So, I'm not here to just talk, when I was playing with the llm I found out that this approach "eats" any html-like tag, even generated by the llm. I.e. Mistral's "<s></s>".
So here's my local version I was updating for myself:
class ModelStateHTMLParser:
def __init__(self):
self.__prev_data_len = 0
self.__ptr = (
re.compile(r"<\|\|_html:([\S\s]+?)_\|\|>|([\S\s]+?)(?=<\|\|_html:|\Z)"))
self.__color_param_ptr = (
re.compile(r"rgba\(([0-9]+?\.??[0-9]*?),\s*?([0-9]+?\.??[0-9]*?),\s*?([0-9]+?\.??[0-9]*?)"))
def feed(self, data):
data_trunc = data[self.__prev_data_len:]
self.__prev_data_len = len(data)
for res in self.__ptr.finditer(data_trunc):
if res.group(1) is not None:
try:
r, g, b = [round(float(n)) for n in self.__color_param_ptr.search(res.group(1)).groups()]
print('\033[38;2;{};{};{}m'.format(r, g, b), end='', flush=True)
except AttributeError:
print('\033[0m', end='', flush=True)
if res.group(2) is not None:
print(res.group(2), end='', flush=True)
- It uses colors from the html markdown dependency-less, just ANSI escape
- Almost bulletproof regex for tag detection
- Last length of the prompt is stored inside the class now, otherwise what's the point not making "feed" static
@hudson-ai I'm ready to push to terminal_colors, just give me the green light XD
Anyways, you guys are awesome, best wishess
Marked this as a draft, as there is clearly a good amount of work left to be done.
@dimondjik thank you for the additions! Definitely good to be careful about eating html tags produced by the model, and I like that this allows us to drop the dependency I added.
Also as a terminal user rather than a notebook user, I'm super excited about this 😆
Maybe the best way for us to collaborate is for you to open a PR to my branch? No need to be super formal or anything :)
I'd just ask that you return from feed rather than printing so that we can handle I/O further up the stack.
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
Attention: Patch coverage is 93.33333% with 2 lines in your changes missing coverage. Please review.
Project coverage is 58.30%. Comparing base (
6e4ee06) to head (0b13ca8). Report is 285 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| guidance/_utils.py | 94.44% | 1 Missing :warning: |
| guidance/models/_model.py | 91.66% | 1 Missing :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## main #905 +/- ##
==========================================
+ Coverage 55.54% 58.30% +2.76%
==========================================
Files 63 63
Lines 4733 4754 +21
==========================================
+ Hits 2629 2772 +143
+ Misses 2104 1982 -122
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.
@nopdive re: colored/colorama, @dimondjik shared some code that should let us avoid adding a dependency at all (although I want to triple check for platform differences when it comes to ansi code support)