Merlin
Update the Merlin examples to use a shared Docker volume
Keeping some notes as I step through the process:
- I think in models/example/04... we need the following change:

  ```python
  # from
  workflow.save('workflow')
  # to
  workflow_path = os.path.join(DATA_FOLDER, "workflow")
  workflow.save(workflow_path)
  ```
- Apt wasn't able to install `tree`. My guess is that the sources list or cache is removed. Running `apt update` resolved it.
- Substitute:

  ```python
  !tree {workflow_path}
  ```

- Ditto the first bullet with the model:

  ```python
  model_path = os.path.join(DATA_FOLDER, "dlrm")
  model.save(model_path)
  ```
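The shared-path plumbing the bullets above rely on can be sketched as follows. This is a minimal sketch, assuming `DATA_FOLDER` is resolved from an environment variable with `/workspace/data` as the fallback; that default is an assumption based on the paths that appear later in this issue:

```python
import os

# Resolve the shared Docker volume folder from the environment.
# The "/workspace/data" fallback is an assumption matching the paths below.
DATA_FOLDER = os.environ.get("DATA_FOLDER", "/workspace/data")

# Save artifacts under the shared volume instead of the working directory,
# so a second container that mounts the same volume sees the same paths.
workflow_path = os.path.join(DATA_FOLDER, "workflow")
model_path = os.path.join(DATA_FOLDER, "dlrm")
```

With both artifacts under one mounted folder, the inference container can load them at the same paths the training container wrote them to.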
For the systems example, if we're considering being prescriptive about the paths, then maybe change from:

```shell
tritonserver --model-repository=/ensemble_export_path/ --backend-config=tensorflow,version=2
```

to

```shell
tritonserver --model-repository=/workspace/data/ensemble ...
```

and indicate that readers should specify a different path if they stepped off the happy path earlier.
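One way to stay prescriptive while still tolerating a non-default path is to derive the model-repository argument from the same environment variable instead of hard-coding it. A sketch (the `/workspace/data` default is an assumption, not a tested container path):

```python
import os

# Build the Triton launch command from the shared-volume path.
# The "/workspace/data" default is an assumption based on this issue's examples.
data_folder = os.environ.get("DATA_FOLDER", "/workspace/data")
ensemble_path = os.path.join(data_folder, "ensemble")
command = (
    f"tritonserver --model-repository={ensemble_path} "
    "--backend-config=tensorflow,version=2"
)
print(command)
```

Readers who exported the ensemble elsewhere would only need to set `DATA_FOLDER` rather than edit the command.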
A little later, it seems that "test" was changed to "valid" in the 04 notebook from models. I think.
```python
from merlin.core.dispatch import get_lib

df_lib = get_lib()

# original_data_path = os.environ.get("DATA_FOLDER", "/workspace/data/")
original_data_path = os.environ.get("INPUT_FOLDER", "/workspace/data/")

# read in data for request
batch = df_lib.read_parquet(
    os.path.join(original_data_path, "valid", "part.0.parquet"),
    num_rows=3,
    columns=workflow.input_schema.column_names,
)
batch
```
I just learned that embedded graphics don't work with MyST-NB, which I've been planning to use. The `tree` command is shown in 04 from models and can replace the graphic. My thinking is to add the following:
```python
!apt update > /dev/null 2>&1
!apt install tree > /dev/null 2>&1
!tree -L 2 {export_path}
```
That produces an OK output:
```text
/workspace/data/ensemble
├── 0_transformworkflow
│   ├── 1
│   └── config.pbtxt
├── 1_predicttensorflow
│   ├── 1
│   └── config.pbtxt
└── ensemble_model
    ├── 1
    └── config.pbtxt

6 directories, 3 files
```
When we move to the new container structure, do we still need a shared docker volume?
@bschifferer normally we should not, but we did not test anything with the new containers yet.
@mikemckiernan , can we close this ?
I guess this is related to the old examples. We haven't had time to work on them yet.
Closing. This idea is generally sound and was especially compelling when we had separate training and inference containers. Now that we have containers that can perform both training and inference, it's less compelling and customers might not immediately recognize the benefit of storing data in a shared volume.