Update the Merlin examples to use a shared Docker volume

Open karlhigley opened this issue 3 years ago • 5 comments

karlhigley avatar Apr 14 '22 15:04 karlhigley

Keeping some notes as I step through the process:

  • I think in models/example/04... we need the following change:
# from
workflow.save('workflow')

# to
workflow_path = os.path.join(DATA_FOLDER, "workflow")
workflow.save(workflow_path)
  • Apt wasn't able to install tree. My guess is that the sources list or cache has been removed. Running apt update resolved it.
  • Substitute !tree {workflow_path}
  • Ditto the first bullet with the model:
model_path = os.path.join(DATA_FOLDER, "dlrm")
model.save(model_path)
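
A minimal sketch of how both paths could be derived from one place. The `artifact_path` helper is hypothetical, and the `/workspace/data/` fallback for `DATA_FOLDER` is an assumption borrowed from the default used elsewhere in these notebooks:

```python
import os

# Assumption: fall back to /workspace/data/ when DATA_FOLDER is not set.
DATA_FOLDER = os.environ.get("DATA_FOLDER", "/workspace/data/")


def artifact_path(name, base=DATA_FOLDER):
    """Return the location of a saved artifact under the shared data folder."""
    return os.path.join(base, name)


# The two locations discussed above.
workflow_path = artifact_path("workflow")
model_path = artifact_path("dlrm")
```

The notebook would then call `workflow.save(workflow_path)` and `model.save(model_path)` as in the bullets above.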

mikemckiernan avatar Apr 14 '22 17:04 mikemckiernan

For the systems example, if we're considering being prescriptive about the paths, then maybe change from:

tritonserver --model-repository=/ensemble_export_path/ --backend-config=tensorflow,version=2

to

tritonserver --model-repository=/workspace/data/ensemble ...

and tell readers to specify a different path if they stepped off the happy path earlier.
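
One way to keep the prescriptive default while still allowing an override might look like the sketch below. The `ENSEMBLE_PATH` variable name and the `triton_command` helper are hypothetical and not part of the examples. Any remaining server flags are passed through unchanged:

```python
import os
import shlex

# Prescriptive default from the suggestion above.
DEFAULT_ENSEMBLE_PATH = "/workspace/data/ensemble"


def triton_command(extra_args=(), env=None):
    """Build the tritonserver argument list, honoring an optional override.

    ENSEMBLE_PATH is a hypothetical variable name used only for illustration.
    """
    env = os.environ if env is None else env
    repo = env.get("ENSEMBLE_PATH", DEFAULT_ENSEMBLE_PATH)
    return ["tritonserver", f"--model-repository={repo}", *extra_args]


# Print the command a reader would run for the current environment.
print(shlex.join(triton_command()))
```

Readers who saved the ensemble somewhere else would set the override before starting the server.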

A little later, it seems that "test" was changed to "valid" in the 04 notebook from models. I think.

import os

from merlin.core.dispatch import get_lib

df_lib = get_lib()
#original_data_path = os.environ.get("DATA_FOLDER", "/workspace/data/")
original_data_path = os.environ.get("INPUT_FOLDER", "/workspace/data/")

# read in data for request
batch = df_lib.read_parquet(
    os.path.join(original_data_path, "valid", "part.0.parquet"),
    num_rows=3,
    columns=workflow.input_schema.column_names,
)
batch

I just learned that embedded graphics, which I had been planning to use, don't work with MyST-NB. The tree command is shown in 04 from models and can replace the graphic. My thinking is to add the following:

!apt update > /dev/null 2>&1
!apt install tree > /dev/null 2>&1

!tree -L 2 {export_path}

That produces an OK output:

/workspace/data/ensemble
├── 0_transformworkflow
│   ├── 1
│   └── config.pbtxt
├── 1_predicttensorflow
│   ├── 1
│   └── config.pbtxt
└── ensemble_model
    ├── 1
    └── config.pbtxt

6 directories, 3 files
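
If installing tree from apt stays unreliable, a pure-Python listing might be an alternative. This is only a sketch: `render_tree` is a hypothetical helper, it approximates `tree -L 2` rather than reproducing it, and it omits the summary line:

```python
from pathlib import Path


def render_tree(root, max_depth=2):
    """Return lines that resemble `tree -L max_depth` output for root."""
    lines = [str(root)]

    def walk(directory, prefix, depth):
        # Stop once the requested number of levels has been listed.
        if depth > max_depth:
            return
        entries = sorted(Path(directory).iterdir(), key=lambda p: p.name)
        for index, entry in enumerate(entries):
            last = index == len(entries) - 1
            lines.append(prefix + ("└── " if last else "├── ") + entry.name)
            if entry.is_dir():
                # Continue the vertical rule unless this was the last entry.
                walk(entry, prefix + ("    " if last else "│   "), depth + 1)

    walk(root, "", 1)
    return lines
```

In a notebook cell this could be used as `print("\n".join(render_tree(export_path)))`.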

mikemckiernan avatar Apr 14 '22 17:04 mikemckiernan

When we move to the new container structure, do we still need a shared docker volume?

bschifferer avatar May 05 '22 07:05 bschifferer

@bschifferer normally we should not, but we have not tested anything with the new containers yet.

rnyak avatar May 05 '22 12:05 rnyak

@mikemckiernan, can we close this?

viswa-nvidia avatar Aug 11 '22 00:08 viswa-nvidia

I guess this is related to the old examples. We haven't had time to work on them yet.

bschifferer avatar Aug 11 '22 07:08 bschifferer

Closing. This idea is generally sound and was especially compelling when we had separate training and inference containers. Now that we have containers that can perform both training and inference, it's less compelling and customers might not immediately recognize the benefit of storing data in a shared volume.

mikemckiernan avatar Sep 12 '22 17:09 mikemckiernan