[Feature] Add predict console command

Open nxexox opened this issue 2 years ago • 0 comments

Problem

I don't always need to run a web application. Sometimes I want to use a model once inside a pipeline, for example through a bash command that takes the data for prediction as input.

Solution

Add a predict console command that loads the model into memory and makes predictions on the data it is given.

I propose to support two use cases:

Predict from a file - parse the file and forward it to the model line by line.
Predict from command arguments - pass one or more lines of data directly as command arguments.

# Examples from file with data
mlup predict -m my_model.onnx -f [--file] path_to_file_with_data.csv -o [--output] path_to_result_file.csv

mlup predict -c path_to_mlup_conf.yaml -f [--file] path_to_file_with_data.csv -o [--output] path_to_result_file.csv

# Examples from bash command arguments with data
mlup predict -m my_model.onnx -r [--raw-string] "1,2,3,4,5,6,7,2,1" -o [--output] path_to_result_file.csv

mlup predict -c path_to_mlup_conf.yaml -r [--raw-string] "1,2,3,4,5,6,7,2,1" -o [--output] path_to_result_file.csv

# Examples with output to console
mlup predict -c path_to_mlup_conf.yaml -r [--raw-string] "1,2,3,4,5,6,7,2,1"
# Print predict result to stdout
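
To make the proposed interface concrete, here is a minimal argparse sketch of the flags used in the examples above. It is only an illustration, not mlup code: the function name, the long option names --model and --config, the mutual exclusivity of the options, and the repeatable -r are all assumptions.

```python
import argparse


def build_predict_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for the proposed `mlup predict` command."""
    parser = argparse.ArgumentParser(prog="mlup predict")

    # Model source: a model file or an mlup config file.
    # Assumption: exactly one of the two is given; the long names are invented.
    model = parser.add_mutually_exclusive_group(required=True)
    model.add_argument("-m", "--model", help="path to the model file")
    model.add_argument("-c", "--config", help="path to the mlup config file")

    # Data source: a file, or raw lines passed on the command line.
    # Assumption: -r may be repeated to pass more than one line.
    data = parser.add_mutually_exclusive_group(required=True)
    data.add_argument("-f", "--file", help="path to the file with data")
    data.add_argument(
        "-r",
        "--raw-string",
        dest="raw_strings",
        action="append",
        help="one line of data; repeat the flag for more lines",
    )

    # No -o means the result goes to stdout.
    parser.add_argument("-o", "--output", default=None, help="path to the result file")
    return parser
```

With this sketch, `build_predict_parser().parse_args(["-m", "my_model.onnx", "-r", "1,2,3"])` yields a namespace where `output` is None, which the command could treat as "print to stdout".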

Because the prediction result can be printed to the console, the command needs a way to enable or disable logging of model loading and operation: the result on stdout may be consumed by the next command in a bash chain. By default, output=stdout disables logging.

Use the -v flag:

-v - only the result of the command.
-vv - INFO logging of loading and operation.
-vvv - DEBUG logging of loading and operation.

# Print only predict result
mlup predict -m my_model.onnx -f [--file] path_to_file_with_data.csv
# predict result

# Print INFO log
mlup predict -m my_model.onnx -f [--file] path_to_file_with_data.csv -vv

# Print DEBUG log
mlup predict -m my_model.onnx -f [--file] path_to_file_with_data.csv -vvv
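
The -v levels above could map to Python logging levels roughly like this. This is a sketch under assumptions: the function names and the dest name are hypothetical, and sending logs to stderr is my suggestion for keeping stdout clean for the prediction result, not something mlup does today.

```python
import argparse
import logging
import sys
from typing import Optional


def level_for_verbosity(count: int) -> Optional[int]:
    """Map the number of -v flags to a logging level (None = logging off)."""
    if count >= 3:
        return logging.DEBUG  # -vvv
    if count == 2:
        return logging.INFO   # -vv
    return None               # no flag or -v: only the result of the command


def configure_logging(count: int) -> None:
    level = level_for_verbosity(count)
    if level is None:
        # Suppress every record up to and including CRITICAL.
        logging.disable(logging.CRITICAL)
    else:
        logging.disable(logging.NOTSET)  # re-enable logging
        # Assumption: logs go to stderr so stdout carries only the result.
        logging.basicConfig(level=level, stream=sys.stderr)


# argparse counts repeated short flags: "-vvv" gives 3.
parser = argparse.ArgumentParser(prog="mlup predict")
parser.add_argument("-v", dest="verbosity", action="count", default=0)
```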

It might be worth adding a 3rd block to the configuration file, which will contain:

Settings for batching lines from a file for prediction: how many objects from the file are sent to the model at a time.
File parser settings: support for CSV, TSV, etc., and the parsing options for these formats.
Settings for saving the model result.

version: 1
ml:
    ...
web:
    ...

console:
    parser: "mlup.console.parsers.csv.CSVParser"
    parser_kwargs: 
        separator: ","
    batching: 10
    ...
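
A rough sketch of how the console block could be consumed: resolve the parser class from its dotted path, build it from parser_kwargs, and feed rows to the model in groups of batching. Everything here is hypothetical. The class mlup.console.parsers.csv.CSVParser does not exist yet, so SimpleCSVParser is a stand-in, and the parse method name is an assumption.

```python
import csv
import importlib
from itertools import islice
from typing import Any, Iterable, Iterator, List


def load_object(dotted_path: str) -> Any:
    """Import 'package.module.Name' and return the Name attribute."""
    module_path, _, name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'package.module.Name', got {dotted_path!r}")
    return getattr(importlib.import_module(module_path), name)


class SimpleCSVParser:
    """Illustrative stand-in for the proposed CSVParser (not mlup code)."""

    def __init__(self, separator: str = ",") -> None:
        self.separator = separator

    def parse(self, lines: Iterable[str]) -> Iterator[List[str]]:
        # csv.reader accepts any iterable of strings, e.g. an open file.
        for row in csv.reader(lines, delimiter=self.separator):
            if row:  # skip blank lines
                yield row


def batched(rows: Iterable[List[str]], size: int) -> Iterator[List[List[str]]]:
    """Yield lists of at most `size` rows; the last batch may be shorter."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Given a loaded config dict, the parser would then be built as `load_object(console["parser"])(**console["parser_kwargs"])`, and each batch from `batched(parser.parse(file), console["batching"])` would go to the model in one call.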

Oct 23 '23 15:10 nxexox