
[Feature request] Logging Level and Progress Bar for Model Downloads

Open aress31 opened this issue 2 years ago • 9 comments

Right now, the output can be quite lengthy and verbose; see:

(screenshot: verbose log output from the WASM/ONNX Runtime backend)

Would it be possible to expose logger options to granularly control the output, as well as to offer visual feedback on the status of the model download from Hugging Face?
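
For the progress side of this: the pipeline options already include a progress_callback (visible in the signature quoted later in this thread), which can drive a progress bar. A minimal Node sketch; the exact shape of the callback's event objects is an assumption here:

import { pipeline } from '@xenova/transformers';

const classifier = await pipeline('sentiment-analysis', null, {
    progress_callback: (event) => {
        // Assumed event shape: { status, file, progress, ... }
        if (event.status === 'progress') {
            // Render a single updating line instead of raw log spam
            process.stdout.write(`\r${event.file}: ${Math.round(event.progress)}%`);
        }
    },
});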

aress31 avatar May 18 '23 20:05 aress31

I agree - that would be a great improvement! It was not originally considered, since the library was designed for browsers, but there has been a lot of interest in Node.js-like environments, so it's definitely something to consider.

I think this would be a good first issue for someone who wants to contribute :)

xenova avatar May 19 '23 06:05 xenova

That is exactly my use case: this would be for a Node application. As such, I need a way to provide some visual feedback on the different operations, something similar in essence to:

  • https://huggingface.co/docs/transformers/main_classes/logging

aress31 avatar May 19 '23 10:05 aress31

The screenshot log comes from WASM, and you have to specify the log verbosity level in transformers.js/src/models.js. For example, you can overwrite the constructSession function with this:

async function constructSession(pretrained_model_name_or_path, fileName, options) {
    // TODO add option for user to force specify their desired execution provider
    let modelFileName = `onnx/${fileName}${options.quantized ? '_quantized' : ''}.onnx`;
    let buffer = await getModelFile(pretrained_model_name_or_path, modelFileName, true, options);

    /** @type {InferenceSession.SessionOptions} */
    const extraSessionOptions = {
        logVerbosityLevel: 4,
        logSeverityLevel: 4, // severity: 0 = verbose, 1 = info, 2 = warning, 3 = error, 4 = fatal
    }
    try {
        return await InferenceSession.create(buffer, {
            executionProviders,
            ...extraSessionOptions
        });
    } catch (err) {
        // If the execution provider was only wasm, throw the error
        if (executionProviders.length === 1 && executionProviders[0] === 'wasm') {
            throw err;
        }

        console.warn(err);
        console.warn(
            'Something went wrong during model construction (most likely a missing operation). ' +
            'Using `wasm` as a fallback. '
        )
        return await InferenceSession.create(buffer, {
            executionProviders: ['wasm'],
            ...extraSessionOptions
        });
    }
}
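
For reference, ONNX Runtime's log severity scale runs from 0 (VERBOSE) to 4 (FATAL) (the full enum is listed later in this thread), so setting logSeverityLevel to 4 here keeps only fatal errors.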

This requires a bit of discussion about how the PR should then expose the session options to transformers.js users :thinking:

kungfooman avatar May 19 '23 16:05 kungfooman

A global env.logLevel should be sufficient, right?

xenova avatar May 20 '23 08:05 xenova

I kinda lean towards extending the options in pipeline(), because I would rather minimize the amount of global env state/variables :thinking:

https://github.com/microsoft/onnxruntime/blob/18f17c555d51caee83f15983c2620f463fbaddd1/js/common/lib/inference-session.ts#L126-L139

We could just make it a pass-through ONNX session options object inside the existing pipeline options object, in case we need to use other session options later as well.

Something like:

/**
 * Utility factory method to build a [`Pipeline`] object.
 *
 * @param {string} task The task of the pipeline.
 * @param {string} [model=null] The name of the pre-trained model to use. If not specified, the default model for the task will be used.
 * @param {PretrainedOptions} [options] Optional parameters for the pipeline.
 * @returns {Promise<Pipeline>} A Pipeline object for the specified task.
 * @throws {Error} If an unsupported pipeline is requested.
 */
export async function pipeline(
    task,
    model = null,
    {
        quantized = true,
        progress_callback = null,
        config = null,
        cache_dir = null,
        local_files_only = false,
        revision = 'main',
        ortExtraSessionOptions = {},
    } = {}
) {
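
Usage could then look like this (ortExtraSessionOptions is the proposed option, not something that exists yet):

const pipe = await pipeline('sentiment-analysis', null, {
    ortExtraSessionOptions: {
        logSeverityLevel: 3, // ERROR: suppress verbose/info/warning output
    },
});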

But we could make the API easier to use as well (e.g. accepting "verbose" instead of 0, because no one wants to memorize enums).

Do we even need both options, logVerbosityLevel and logSeverityLevel? It's not clear to me yet what the difference is :see_no_evil:

I have a slight feeling that other devs are also a bit confused:

(screenshot: an onnxruntime discussion showing similar confusion about these two options)

kungfooman avatar May 20 '23 15:05 kungfooman

Do we even need both options, logVerbosityLevel and logSeverityLevel? It's not clear to me yet what the difference is 🙈

My thought exactly. This is why a global logging level might be okay. It will also be better for when we add more backends (not just onnx). If someone REALLY wants those log levels, they can set them manually with env.backends.onnx.*
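
For example, in a Node script (using the backend env property mentioned later in this thread):

import { env } from '@xenova/transformers';

// Silence everything below ERROR coming from the ONNX Runtime backend.
env.backends.onnx.logLevel = 'error';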

We could just make it a pass-through ONNX session options object inside the existing pipeline options object, in case we need to use other session options later as well.

I'm not too keen on adding that to the pipeline function, just because it's not something users will modify often. It will also then have to be used in AutoModel and similar locations.

xenova avatar May 20 '23 16:05 xenova

For people looking for the final answer/code: adding a Node.js code snippet here to set the log level to 3 (ERROR).

Log severity levels:

VERBOSE = 0,
INFO = 1,
WARNING = 2,
ERROR = 3,
FATAL = 4

Your code:

import { pipeline, env } from '@xenova/transformers';

env.cacheDir = './.cache';
env.backends.onnx.logLevelInternal = 'error'; // this line sets the ONNX Runtime log level
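
With the level set to 'error', only ERROR and FATAL messages from the ONNX Runtime backend should still be emitted.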

AmitDJagtap avatar Jul 12 '24 06:07 AmitDJagtap

Is there anything left to do for this feature? I would like to contribute, but it is not clear to me whether it needs further work. Thank you!

fcuenya avatar Oct 28 '24 09:10 fcuenya

@fcuenya we can configure env.backends.onnx.logLevel, which should already control a lot of the ONNX logging. But transformers.js also contains a few places where it logs via console.error or console.warn; controlling those through env is what's still missing here, as far as I understand.
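
A minimal sketch of what that missing piece could look like inside the library (the log helper and the env.logLevel setting are hypothetical, not existing API):

import { env } from '@xenova/transformers';

const LOG_LEVELS = { verbose: 0, info: 1, warning: 2, error: 3, fatal: 4 };

// Hypothetical helper: only forwards messages at or above env.logLevel.
function log(level, ...args) {
    const threshold = LOG_LEVELS[env.logLevel ?? 'warning']; // env.logLevel is an assumed new setting
    if (LOG_LEVELS[level] >= threshold) {
        const fn = LOG_LEVELS[level] >= LOG_LEVELS.error ? console.error : console.warn;
        fn(...args);
    }
}

// Existing call sites would then change from, e.g.:
//   console.warn('Using `wasm` as a fallback.');
// to:
//   log('warning', 'Using `wasm` as a fallback.');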

TimPietrusky avatar Feb 19 '25 12:02 TimPietrusky