Clarification or Improvement: Should preprocess_model_input Also Affect Inputs to Scorer Functions?

Open • oekekezie opened this issue 9 months ago • 1 comment

Description

Currently, in the Weave evaluation framework (weave.flow.eval), the preprocess_model_input function provided by users only transforms inputs passed into the model's prediction function. Scorer functions, however, always receive the original, unprocessed input example directly from the dataset.

Current Behavior

preprocess_model_input transforms the input before it's passed to the model's predict function.
Scorer functions receive the original, unprocessed dataset input; the preprocessing step is never applied to it. Is my understanding correct?

Relevant Code Snippet:

# apply preprocessing for model input
apply_model_result = await apply_model_async(model, example, self.preprocess_model_input)

# scorer gets original input without preprocessing
for scorer in self.scorers:
    apply_scorer_result = await model_call.apply_scorer(scorer, example)
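
To make the behavior concrete, here is a minimal standalone sketch that mimics the flow in the snippet above without importing Weave. The helpers `predict`, `scorer`, and `evaluate_one` and the example data are hypothetical stand-ins, not Weave APIs; only the name `preprocess_model_input` comes from the framework.

```python
import asyncio


def preprocess_model_input(example: dict) -> dict:
    # Hypothetical preprocessing: normalize the question text for the model.
    return {"question": example["question"].strip().lower()}


async def predict(question: str) -> str:
    # Stand-in model: echoes back exactly the input it was given.
    return question


def scorer(example: dict, output: str) -> dict:
    # Stand-in scorer: records which input it saw and compares it to the output.
    return {
        "scorer_saw": example["question"],
        "matches_model_input": example["question"] == output,
    }


async def evaluate_one(example: dict):
    # The model receives the preprocessed input...
    model_input = preprocess_model_input(example)
    output = await predict(**model_input)
    # ...while the scorer receives the original dataset example.
    score = scorer(example, output)
    return output, score


example = {"question": "  What Is Weave?  "}
output, score = asyncio.run(evaluate_one(example))
print(output)
print(score)
```

In this sketch the model sees the normalized text, while the scorer compares against the raw text and so reports a mismatch, which is the kind of subtle discrepancy this issue is about.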

Issue

This behavior can be confusing: users may expect scorers to evaluate against the preprocessed inputs, especially when preprocessing performs normalization, cleaning, or formatting that both the model and the evaluation metrics rely on.

Suggested Resolution

State explicitly in the documentation that scorer functions always receive the original inputs.
Alternatively, add an option for scorers to receive preprocessed inputs, for example through an additional argument or flag on the Evaluation class.
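
As a rough illustration of the second option, the sketch below uses a toy `MiniEvaluation` class, not `weave.Evaluation`. The flag name `scorers_use_preprocessed_input` is purely hypothetical and does not exist in Weave; it only shows how an opt-in switch could keep the current default while letting scorers see the preprocessed input.

```python
import asyncio
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MiniEvaluation:
    """Toy stand-in for an evaluation loop; NOT weave.Evaluation."""

    scorers: list
    preprocess_model_input: Optional[Callable[[dict], dict]] = None
    # Hypothetical flag proposed in this issue; it does not exist in Weave.
    scorers_use_preprocessed_input: bool = False

    async def evaluate_example(self, predict, example: dict) -> dict:
        # Apply preprocessing for the model, if any was provided.
        if self.preprocess_model_input is not None:
            model_input = self.preprocess_model_input(example)
        else:
            model_input = example
        output = await predict(**model_input)
        # Default: scorers see the original example. Opt-in: the preprocessed one.
        if self.scorers_use_preprocessed_input:
            scorer_input = model_input
        else:
            scorer_input = example
        scores = [s(scorer_input, output) for s in self.scorers]
        return {"output": output, "scores": scores}


async def predict(text: str) -> str:
    # Stand-in model.
    return text.upper()


def clean(example: dict) -> dict:
    # Hypothetical preprocessing step.
    return {"text": example["text"].strip()}


def saw(example: dict, output: str) -> str:
    # Stand-in scorer that reports the input it received.
    return example["text"]


raw = {"text": " hi "}
default_eval = MiniEvaluation(scorers=[saw], preprocess_model_input=clean)
opt_in_eval = MiniEvaluation(
    scorers=[saw],
    preprocess_model_input=clean,
    scorers_use_preprocessed_input=True,
)
default_result = asyncio.run(default_eval.evaluate_example(predict, raw))
opt_in_result = asyncio.run(opt_in_eval.evaluate_example(predict, raw))
print(default_result["scores"], opt_in_result["scores"])
```

Keeping the flag off by default would preserve today's behavior for existing evaluations, so only users who opt in would see scorers receive preprocessed inputs.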

Additional Context

This clarification or adjustment will help users avoid subtle bugs or misunderstandings when setting up evaluations, especially in complex preprocessing scenarios.
