
Runs created programmatically and from the UI are not used the same way on the UI for prompt engineering

Open GhaithDek opened this issue 11 months ago • 5 comments

Discussed in https://github.com/mlflow/mlflow/discussions/11423

Originally posted by GhaithDek March 14, 2024: when we create a run in the UI using prompt engineering, the Evaluation tab shows info like the model, max tokens, temperature, and template logged as variables of the run. This makes it possible to evaluate on new rows of data by clicking the "Evaluate all" button. This is not possible when we create runs programmatically, at least as far as I know.

GhaithDek avatar Mar 14 '24 13:03 GhaithDek

The prompt engineering UI has a lot of stuff logged automatically, but it should be fully possible to log all of those things manually as well.

cc @serena-ruan, does the langchain autologging feature support logging stuff like max tokens, temperature, template, etc?
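
One quick way to check would be something like this (a minimal sketch, assuming a recent MLflow with LangChain autologging and the langchain package installed; the chain invocation itself is omitted):

import mlflow

# Enable LangChain autologging (assumes the langchain flavor is available).
mlflow.langchain.autolog()

with mlflow.start_run() as run:
    # ... invoke a chain here, e.g. one built from a prompt template ...
    pass

# Inspect which parameters were actually captured on the run.
params = mlflow.get_run(run.info.run_id).data.params
print(params)  # look for max_tokens / temperature / template-style entries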

daniellok-db avatar Mar 15 '24 00:03 daniellok-db

@daniellok-db thanks for getting back. Yes, I log that info manually as well, but the problem is that it isn't logged the same way as when the run is created in the UI. For example, in the attached screenshot we see variables like max_tokens and prompt as variables of the run; we don't see that when the run is created programmatically.
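
Roughly, what I do looks like this (parameter names and values are just illustrative):

import mlflow

# Plain programmatic logging: the values show up under the run's Parameters,
# but the run is not treated as a prompt engineering run in the Evaluation tab.
with mlflow.start_run():
    mlflow.log_param("model", "gpt-3.5-turbo")
    mlflow.log_param("max_tokens", 100)
    mlflow.log_param("temperature", 0.9)
    mlflow.log_param("prompt_template", "Answer the question: {{ question }}")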

GhaithDek avatar Mar 15 '24 02:03 GhaithDek

Ah, I see your point now. cc @prithvikannan what's required for the prompt engineering UI to show up in the evaluate tab? Is it possible for the user to construct the artifacts manually?

daniellok-db avatar Mar 15 '24 06:03 daniellok-db

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.

github-actions[bot] avatar Mar 22 '24 00:03 github-actions[bot]

I think this is related to https://github.com/mlflow/mlflow/blob/3a1d12fa77be8b7632bb526fa4299fafb83a4a6d/mlflow/server/js/src/experiment-tracking/components/prompt-engineering/PromptEngineering.utils.ts#L88

With mlflow runs created like this:

import mlflow

# Tagging the run as a prompt engineering run is what the UI keys off
# (see the runSourceType check linked above).
with mlflow.start_run(tags={
    "mlflow.runSourceType": "PROMPT_ENGINEERING"
}):
    mlflow.log_param("model_route", "chat")
    mlflow.log_param("max_tokens", 100)
    mlflow.log_param("temperature", 0.9)
    mlflow.log_param("route_type", "mlflow_deployment_endpoint")
    mlflow.log_param("prompt_template", "<your prompt with {{ template_variables }} used>")

You'll see the temperature / max tokens etc. in the UI (see the attached screenshot).

You can preview the evaluations from the UI, but saving the evaluations does not work for me.
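
As a possible workaround sketch (untested; it assumes the Evaluate tab reads rows from an eval_results_table.json table artifact, and the column names below are illustrative), the evaluation data could be logged programmatically like this:

import mlflow
import pandas as pd

# Assumed artifact name and column layout -- both are guesses, not confirmed
# against what the prompt engineering UI actually writes.
eval_rows = pd.DataFrame({
    "question": ["What is MLflow?"],
    "output": ["MLflow is an open source platform for the ML lifecycle."],
})

with mlflow.start_run(tags={"mlflow.runSourceType": "PROMPT_ENGINEERING"}):
    mlflow.log_table(data=eval_rows, artifact_file="eval_results_table.json")

The rows can at least be read back with mlflow.load_table("eval_results_table.json", run_ids=[...]); whether the Evaluate tab picks them up the same way as UI-saved evaluations is exactly the open question in this issue.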

marrrcin avatar Apr 29 '24 10:04 marrrcin