[FR] `log_text` support for arbitrary plain text file extensions

Open edemattos opened this issue 3 years ago • 4 comments

Willingness to contribute

Yes. I would be willing to contribute this feature with guidance from the MLflow community.

Proposal Summary

The UI does not render plain text for artifacts logged with a custom file extension. `mlflow.log_text()` writes the file to the mlruns directory and the file appears in the UI, but its contents are not displayed. Only this message is shown: "Select a file to preview. Supported formats: image, text, html, pdf, geojson files."

Motivation

What is the use case for this feature? Why is this use case valuable to support for MLflow users in general?

This would extend the existing feature to any plain text file, not only files with a .txt or .log extension.

Why is this use case valuable to support for your project(s) or organization?

We use a few custom file extensions to identify certain output types, and would like to log all outputs in a single place.

Why is it currently difficult to achieve this use case?

If I simply change our custom extensions to .txt or .log when logging, then the contents are properly shown in the UI. But this means writing each file twice: once with the custom extension to comply with our internal tools, and once as .txt or .log so it can be viewed in the UI alongside the rest of the model outputs.
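For illustration, the double-write workaround described above could be sketched roughly as follows. The helper names `preview_name` and `log_text_with_preview` are hypothetical and not part of MLflow; the logging function is passed in as a callable so the sketch runs without a tracking server. It assumes, per the description above, that only .txt and .log artifacts are previewed.

```python
from pathlib import PurePosixPath
from typing import Callable, List

# Assumption taken from the issue text: the UI previews .txt and .log artifacts.
PREVIEWABLE_SUFFIXES = (".txt", ".log")


def preview_name(artifact_file: str) -> str:
    """Return an artifact path the UI should preview (hypothetical helper).

    Paths already ending in .txt or .log are returned unchanged; otherwise
    ".txt" is appended, e.g. "outputs/result.myext" -> "outputs/result.myext.txt".
    """
    path = PurePosixPath(artifact_file)
    if path.suffix in PREVIEWABLE_SUFFIXES:
        return artifact_file
    return str(path.with_name(path.name + ".txt"))


def log_text_with_preview(
    text: str,
    artifact_file: str,
    log_text: Callable[[str, str], None],
) -> List[str]:
    """Log `text` under its original name and, if needed, under a .txt copy.

    `log_text` is any callable taking (text, artifact_file), such as
    `mlflow.log_text`. Returns the artifact paths that were logged.
    """
    logged = [artifact_file]
    log_text(text, artifact_file)
    preview = preview_name(artifact_file)
    if preview != artifact_file:
        # Second write: same contents, previewable extension.
        log_text(text, preview)
        logged.append(preview)
    return logged
```

Inside an active run this might be called as `log_text_with_preview(report, "outputs/result.myext", mlflow.log_text)`. It still writes the contents twice, which is exactly the duplication this request is trying to avoid.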

Details

No response

What component(s) does this bug affect?

  • [X] area/artifacts: Artifact stores and artifact logging
  • [ ] area/build: Build and test infrastructure for MLflow
  • [ ] area/docs: MLflow documentation pages
  • [ ] area/examples: Example code
  • [ ] area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • [ ] area/models: MLmodel format, model serialization/deserialization, flavors
  • [ ] area/projects: MLproject format, project running backends
  • [ ] area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • [ ] area/server-infra: MLflow Tracking server backend
  • [ ] area/tracking: Tracking Service, tracking client APIs, autologging

What interface(s) does this bug affect?

  • [X] area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • [ ] area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • [ ] area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • [ ] area/windows: Windows support

What language(s) does this bug affect?

  • [ ] language/r: R APIs and clients
  • [ ] language/java: Java APIs and clients
  • [ ] language/new: Proposals for new client languages

What integration(s) does this bug affect?

  • [ ] integrations/azure: Azure and Azure ML integrations
  • [ ] integrations/sagemaker: SageMaker integrations
  • [ ] integrations/databricks: Databricks integrations

edemattos • May 22 '22 22:05

@sunishsheth2009 @xanderwebs Do you have any thoughts here about rendering of arbitrary artifacts? Is it safe to attempt to render them as text?

dbczumar • May 25 '22 00:05

Seems okay as long as we make sure we escape the text properly to avoid code injection.

xanderwebs • May 25 '22 14:05

Agree with Alex. We need to make sure the text is HTML-safe, using something like html_sanitize.

sunishsheth2009 • May 25 '22 16:05
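As a rough illustration of the escaping concern raised in the two comments above, here is a minimal sketch using Python's standard-library `html.escape`. The function name `render_plain_text` is hypothetical. The MLflow UI itself is a JavaScript front end, so this only shows the idea and is not how the UI is implemented.

```python
import html


def render_plain_text(text: str) -> str:
    """Wrap untrusted artifact text in <pre> after escaping HTML metacharacters.

    html.escape replaces &, <, and > (and, with quote=True, both quote
    characters) with entities, so the text is displayed literally instead of
    being interpreted as markup.
    """
    return "<pre>" + html.escape(text, quote=True) + "</pre>"
```

With this, a payload such as `<script>alert(1)</script>` is emitted as inert text rather than as an executable tag.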

@harupy Any updates on https://github.com/mlflow/mlflow/issues/5948?

dbczumar • Aug 09 '22 18:08