giskard icon indicating copy to clipboard operation
giskard copied to clipboard

[GSK-1955] Scan custom stdout

Open rabah-khalek opened this issue 5 months ago • 4 comments

Supersedes: https://github.com/Giskard-AI/giskard/pull/1549

Since the scan is our main API, it is very important to make it's output as readable as possible. For this, a set of templates and colors are already defined (see xprint.py and test_xprint.py).

Todo:

  • Wrap xprint into a class similar to logging.Logger that takes in three levels (Note that an attempt of inheriting from Logger was done here but wasn't really useful, let's create our own class) :
    • INFO: default that prints out useful logs (which detector is running, summaries, etc.)
    • DEBUG: default that prints out additional logs (content of prompts, failed examples, etc.)
    • CRITICAL: that only prints out errors and warnings if they occur
  • Removing the Catalog in favour of putting templates per module where they are needed
  • To take from the old PR: halting the user with a decision to make if they're scanning an LLM model (based on cost estimate): https://github.com/Giskard-AI/giskard/blob/cdc693fe77647737a3fed2845ef0f15f1a69fff2/giskard/scanner/scanner.py#L44-L57
  • Control verbosity level globally via the class, and do NOT propagate a verbose parameter everywhere.
  • A lot of examples to take from https://github.com/Giskard-AI/giskard/pull/1549 (prints already implemented in the detectors)

rabah-khalek avatar Jan 21 '24 09:01 rabah-khalek

@Sakayatsp, reflecting on all this, I think @Hartorn and @mattbit raised good points: I might've overcomplicated this. My aim was to automate the wrapping of an output element: "{reset}{font}{color}{" + PLACEHOLDER + "}{reset}" with an interface, but yeah likely it was an overkill.

Based on all the comments above, here's what I propose:

  • let's refactor ScanLogger into a ScanPrinter that handles two level of verbosities: LOW and HIGH.
  • let's re-assign all the critical stuff (warnings and errors), into the logging.Logger, that way we have a clear separation between what's printer and what's logger.
  • let's forget about the dumping into file
  • To simplify things, let's abandon the current Template and xprint interface. Let's just create aliases for the colors and simply call them directly.

An example of the current implementation:

class ScanPrinter:
    def high(
        self,
        *args,
        template: Optional[Template] = None,
        filename: Optional[str] = None,
        **kwargs,
    ):
        if self.level.value <= VerbosityLevel.HIGH.value:
            xprint(*args, template=template, filename=filename, **kwargs)

RED_COLOR = Fore.LIGHTRED_EX
BLUE_COLOR = Fore.LIGHTBLUE_EX
MAGENTA_COLOR = Fore.LIGHTMAGENTA_EX

RED_STYLE = Style(color=RED_COLOR)
BLUE_STYLE = Style(color=BLUE_COLOR)
MAGENTA_STYLE = Style(color=MAGENTA_COLOR)

DetectedIssues = Template(content="{}: {} issues detected. (Took {})", pstyles=[BLUE_STYLE, RED_STYLE, MAGENTA_STYLE])
printer.low(detector.__class__.__name__, num_issues, time, template=DetectedIssues)

here's what we can do instead:

class ScanPrinter:
    def high(
        self,
        *args,
        **kwargs,
    ):
        if self.level.value <= VerbosityLevel.HIGH.value:
            print(*args, **kwargs)

from colorama import Fore
from colorama import Style


RED = Fore.LIGHTRED_EX
BLUE = Fore.LIGHTBLUE_EX
MAGENTA = Fore.LIGHTMAGENTA_EX

R = Style.RESET_ALL
BOLD = Style.BRIGHT
BRED = BOLD + RED
BBLUE = BOLD + BLUE
BMAGENTA = BOLD + MAGENTA

printer.low(f"{BBLUE}{detector.__class__.__name__}{R}: {RRED}{num_issues}{R} issues detected. (Took {BMAGENTA}{time}{R})")

Although we'd lose the idea of one Template for similar outputs, I think it makes it simpler and more readable.

Before diving into it (you may want to do it in a new branch @Sakayatsp , up to you), @mattbit @Hartorn do you agree on this?

rabah-khalek avatar Feb 04 '24 13:02 rabah-khalek

@rabah-khalek should we close this PR ?

Hartorn avatar Feb 13 '24 09:02 Hartorn