kepler-model-server
Find spec/node_type of Kepler node for model selection
What would you like to be added?
A flow that links the specification of the node where Kepler is deployed to model selection from the Kepler model DB.
Why is this needed?
Problem description
Previously, the pipeline had only a single node_type, so we always appended _1 to the trainer name to get the model name. However, with SPECPower and AWS instances, we can now train models for multiple node_type values.
Currently, kepler-model-server has a Python function, generate_spec, that generates the machine spec.
Idea
The goal is to let Kepler determine its node_type. The generate_spec logic may not need to be merged into Kepler itself: it can run in an init container that generates the spec and saves it to a file to be mounted. The server API may need to be updated so that a machine spec can be included in the request used to select the model.
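A minimal sketch of this flow, under stated assumptions: collect_machine_spec is a hypothetical stand-in for generate_spec (its real output fields are not shown here), and the machine_spec key in the request is only a proposed name, not an existing server API field.

```python
import json
import os
import platform


def collect_machine_spec():
    """Hypothetical stand-in for generate_spec: gather a few node attributes."""
    return {
        # platform.processor() can be empty on some systems, so fall back.
        "processor": platform.processor() or platform.machine(),
        "cores": os.cpu_count(),
    }


def write_spec_file(spec, path):
    """Init-container step: save the spec to a file on a shared volume."""
    with open(path, "w") as f:
        json.dump(spec, f)


def build_model_request(spec_path, base_request):
    """Kepler-side step: read the mounted spec and attach it to the request.

    The "machine_spec" key is an assumption for illustration only.
    """
    with open(spec_path) as f:
        spec = json.load(f)
    request = dict(base_request)  # copy so the caller's dict is not mutated
    request["machine_spec"] = spec
    return request
```

In this sketch the init container would call write_spec_file(collect_machine_spec(), path) once, and Kepler would call build_model_request with the same mounted path when asking the server for a model.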
Note that:
- node_type is per pipeline, determined by node_type_index.json inside the pipeline folder.
- we can set the default pipeline to spec_benchmark for the acpi value and aws_instance_pipeline for the rapl value.
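The two notes above could be combined on the server side roughly as follows. This is only a sketch: the layout assumed for node_type_index.json (a mapping from node_type to the spec attributes it covers), the exact-match rule, and the function names are all assumptions, not the actual file format or API.

```python
import json
import os

# Default pipeline per energy source value, as proposed in the notes above.
DEFAULT_PIPELINES = {
    "acpi": "spec_benchmark",
    "rapl": "aws_instance_pipeline",
}


def default_pipeline(energy_source):
    """Return the default pipeline name, or None for an unknown source."""
    return DEFAULT_PIPELINES.get(energy_source)


def find_node_type(pipeline_dir, spec):
    """Look up the node_type whose attributes all match the given spec.

    Assumes node_type_index.json looks like
    {"1": {"processor": "...", "cores": 4}, ...}; the real layout may differ.
    """
    index_path = os.path.join(pipeline_dir, "node_type_index.json")
    with open(index_path) as f:
        index = json.load(f)
    for node_type, attrs in index.items():
        if all(spec.get(key) == value for key, value in attrs.items()):
            return int(node_type)
    return None  # no matching node_type in this pipeline


def model_name(trainer_name, node_type):
    """Model name is the trainer name followed by _<node_type>."""
    return f"{trainer_name}_{node_type}"
```

Returning None when no entry matches leaves the fallback policy (for example, picking a nearest node_type or a default) to the caller, since that choice is not settled in this issue.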