
Python and audit support

Open rushter opened this issue 4 years ago • 4 comments

Hi,

Is there a way to get audit results as a string?

I found the audit_example method, but it outputs to stdout.

import vowpalwabbit.pyvw
model = vowpalwabbit.pyvw.vw('-t -i model.vw --json --audit')
example = model.parse(example_as_str)
output = model.audit_example(example)

I see that audit is mentioned in Python docstrings, but there are no examples or documentation on how to retrieve results.

        """Initialize the vw object.

        Parameters
        ----------

        arg_str : str
            The command line arguments to initialize VW with,
            for example "--audit". By default is None.

rushter avatar Feb 19 '21 15:02 rushter

There is not currently a way to do this unfortunately. However, it is clearly highly valuable and something I think we should do.

jackgerrits avatar Feb 19 '21 15:02 jackgerrits

This could be achievable with the following bits:

  • Avoiding the std::cout https://github.com/VowpalWabbit/vowpal_wabbit/blob/4ccc76046256d3c8e7bd46900139c9d472a28967/vowpalwabbit/gd.cc#L305

  • instantiating vw with enable_logging=True https://github.com/VowpalWabbit/vowpal_wabbit/blob/4ccc76046256d3c8e7bd46900139c9d472a28967/python/tests/test_cb.py#L41

  • and then calling output = vw.get_log()

lalo avatar Feb 19 '21 15:02 lalo

Hi, doing

from vowpalwabbit import pyvw
model = pyvw.vw('--audit')
ex = model.parse("0:0.1:0.75 | a:0.5 b:1 c:2")
output = model.audit_example(ex)

prints a:92594:0.5:0@0 b:163331:1:0@0 c:185951:2:0@0 Constant:116060:1:0@0 to stdout, but does not return it as a string. So I tried what @lalo suggested, and model.get_log() gives me

['Num weight bits = 18\n',
 'learning rate = 0.5\n',
 'initial_t = 0\n',
 'power_t = 0.5\n',
 'using no cache\n',
 'Reading datafile = \n',
 'num sources = 1\n',
 'Enabled reductions: gd, scorer\n',
 'average  since         example        example  current  current  current\n',
 'loss     last          counter         weight    label  predict features\n']

This output doesn't include the audit line, though. @bassmang @jackgerrits

priyanshuone6 avatar Dec 12 '21 12:12 priyanshuone6

Eduardo pointed out that, with some work, the audit output may become accessible from the logs. However, as it stands, audit output only goes to stdout. This is a feature request to make it accessible from Python.

jackgerrits avatar Jan 10 '22 18:01 jackgerrits
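
Until that exists, one possible workaround is to capture the process-level stdout around the call. The sketch below is not a VW feature. It assumes the audit line is written to file descriptor 1 by native code, which is why contextlib.redirect_stdout alone would not see it. The names capture_fd_stdout and _flush_c_streams are hypothetical helpers, and the ctypes flush is a best-effort step that assumes a POSIX-like libc. Without a successful flush, buffered native output may not show up in the captured text.

```python
import contextlib
import ctypes
import os
import sys
import tempfile


def _flush_c_streams():
    """Best-effort flush of C stdio buffers (assumes a POSIX-like libc)."""
    try:
        ctypes.CDLL(None).fflush(None)
    except Exception:
        pass  # e.g. on Windows; captured output may then be incomplete


@contextlib.contextmanager
def capture_fd_stdout():
    """Redirect file descriptor 1 to a temp file; expose the text afterwards.

    The yielded dict gets its "text" key filled in when the block exits.
    """
    result = {"text": ""}
    sys.stdout.flush()
    _flush_c_streams()
    saved_fd = os.dup(1)  # keep a handle on the real stdout
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 1)  # fd 1 now points at the temp file
        try:
            yield result
        finally:
            sys.stdout.flush()
            _flush_c_streams()
            os.dup2(saved_fd, 1)  # restore the real stdout
            os.close(saved_fd)
            tmp.seek(0)
            result["text"] = tmp.read().decode("utf-8", errors="replace")


# Intended use with the snippet from this thread (untested assumption):
#
#     with capture_fd_stdout() as captured:
#         model.audit_example(ex)
#     audit_str = captured["text"]
```

Anything else written to stdout inside the with block, including Python print calls, ends up in the same string, so it is best to keep the block limited to the audit_example call.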