[FR] Grading Notes

Open shabie opened this issue 6 months ago • 1 comments

Willingness to contribute

Yes. I would be willing to contribute this feature with guidance from the MLflow community.

Proposal Summary

Hello everyone.

This blog post from Databricks introduces the idea of grading notes: https://www.databricks.com/blog/enhancing-llm-as-a-judge-with-grading-notes.

The idea is that for many use cases, writing full reference answers for an LLM-as-a-judge to compare against is very time-consuming. Giving the judge short notes on what to look out for is still manual work, but it is easier.
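To make the idea concrete, here is a minimal sketch of how a grading note could be fed to a judge in place of a full reference answer. It is plain Python and does not use or reflect any MLflow API; the names `GradedExample`, `JUDGE_TEMPLATE`, and `build_judge_prompt`, as well as the prompt wording, are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GradedExample:
    """One evaluation row. All field names are hypothetical."""
    question: str
    answer: str        # output of the model under evaluation
    grading_note: str  # short human-written hints for the judge


# Hypothetical judge prompt: the grading note takes the place of a
# full reference answer.
JUDGE_TEMPLATE = """You are grading an answer to a question.
Use the grading note as the criteria for a correct answer.

Question: {question}
Answer: {answer}
Grading note: {grading_note}

Reply with "correct" or "incorrect", then a one-sentence justification."""


def build_judge_prompt(example: GradedExample) -> str:
    """Fill the judge template with one example's fields."""
    return JUDGE_TEMPLATE.format(
        question=example.question,
        answer=example.answer,
        grading_note=example.grading_note,
    )


example = GradedExample(
    question="Write a function that deduplicates a list while keeping order.",
    answer="def dedupe(xs): return list(dict.fromkeys(xs))",
    grading_note="Must preserve first-seen order; must not use set() alone.",
)
prompt = build_judge_prompt(example)
```

The resulting `prompt` string would then be sent to whichever judge model is in use. The note here is a single line, whereas a reference answer would have to be a complete, working implementation.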

Since I can relate to this idea a lot and find the solution simple and yet very useful, I would like to work on this and make a PR.

Motivation

What is the use case for this feature?

This is used for LLM output evaluation.

Why is this use case valuable to support for MLflow users in general?

MLflow provides multiple evaluation options, and this idea extends them by allowing human preference to guide the LLM judge.

Why is this use case valuable to support for your project(s) or organization?

For code generation work that I have done, giving the LLM judge high-level guidelines on what a task's solution should look like is much more scalable than writing out fully working code as a reference.

Why is it currently difficult to achieve this use case?

It isn't "difficult" per se, but providing support for it out of the box will 1) make it even simpler to handle and 2) expose the idea to people who may not have discovered it otherwise.

Details

No response

What component(s) does this bug affect?

[ ] area/artifacts: Artifact stores and artifact logging
[ ] area/build: Build and test infrastructure for MLflow
[ ] area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
[ ] area/docs: MLflow documentation pages
[ ] area/examples: Example code
[ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
[ ] area/models: MLmodel format, model serialization/deserialization, flavors
[ ] area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
[ ] area/projects: MLproject format, project running backends
[ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
[ ] area/server-infra: MLflow Tracking server backend
[ ] area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

[ ] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
[ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
[ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
[ ] area/windows: Windows support

What language(s) does this bug affect?

[ ] language/r: R APIs and clients
[ ] language/java: Java APIs and clients
[ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

[ ] integrations/azure: Azure and Azure ML integrations
[ ] integrations/sagemaker: SageMaker integrations
[ ] integrations/databricks: Databricks integrations

Aug 02 '24 11:08 shabie
