maestrowf
Give user ability to access yaml file from generator
When a user creates a generator, the generator currently cannot access data from the original YAML file.
Also, the variables under env are parsed first, which means we cannot put anything there for the generator to look at, as doing so disables the generator parameters.
Currently, as a workaround, I use sys.argv to locate the original YAML file. There are two reasons for this:
- There is no other way to know the name of the original YAML file.
- Even when sticking the name in an env variable, the original YAML file has not yet been copied into OUTPUT_PATH by the time we get to the generator function.
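For reference, the sys.argv workaround can be made a bit less fragile than taking the last argument blindly. The sketch below is only an illustration: find_spec_path is a hypothetical helper (not part of maestrowf), and it assumes the specification is passed on the command line as a path ending in .yaml or .yml.

```python
import os
import sys


def find_spec_path(argv=None):
    """Return the last command-line argument that names an existing YAML file.

    Hypothetical helper for illustration only; it is not part of maestrowf.
    """
    argv = sys.argv if argv is None else argv
    # Scan from the end, since the specification is assumed to come last.
    for arg in reversed(argv):
        if arg.lower().endswith((".yaml", ".yml")) and os.path.isfile(arg):
            return os.path.abspath(arg)
    raise FileNotFoundError("no YAML specification found in argv")
```

This still depends on how the command was invoked, which is why having the path (or the parsed content) handed to the generator would be preferable.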
The other issue is that a spec verification is done on global.parameters which enforces that all parameters have the same length. I was able to bypass it by padding the shorter lists with duplicates and using set in my generator.
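To illustrate the padding trick, here is a minimal sketch. Both function names are made up for this example, and it assumes every value list is non-empty and contains hashable items.

```python
def pad_to_equal_length(params):
    """Pad each value list by repeating its last element until all lists match."""
    longest = max(len(values) for values in params.values())
    return {
        name: list(values) + [values[-1]] * (longest - len(values))
        for name, values in params.items()
    }


def dedupe(values):
    """Drop duplicates while keeping first-seen order (unlike a plain set)."""
    return list(dict.fromkeys(values))


# Same values as the sample study in this issue.
padded = pad_to_equal_length(
    {"SIZE": [1, 2, 3, 4], "TRIAL": [5, 4], "ITERATION": [5, 3, 6]}
)
```

Here padded["TRIAL"] becomes [5, 4, 4, 4], which passes an equal-length check, and dedupe(padded["TRIAL"]) recovers [5, 4] inside the generator. The obvious drawback is that a value that was intentionally listed twice is lost as well.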
To make it easy to develop and to disturb Maestro as little as possible, I went with the following, which I propose as a suggestion:
I created a generator.parameters section in the YAML file. Currently I use sys.argv to reparse the YAML file and get to this section, but I think having this section available to the user (via env or some other way) would make things much easier for the end user.
Also, having the entire parsed content of the YAML file available could be an even better solution, as the generator might want to know about other things (such as scheduling).
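If the parsed specification were handed to the generator as a plain dictionary, reading the custom section could look roughly like this. This is a sketch under assumptions: read_generator_section is a hypothetical name, and falling back to NAME.%% when no label is given is my own choice for the example, not existing Maestro behavior.

```python
def read_generator_section(spec, section="generator.parameters"):
    """Split the custom section of an already-parsed spec into values and labels.

    `spec` is assumed to be the dictionary obtained from parsing the YAML file.
    """
    values, labels = {}, {}
    for name, entry in spec.get(section, {}).items():
        raw = entry["values"]
        # Accept a single scalar as a one-element list.
        values[name] = list(raw) if isinstance(raw, (list, tuple)) else [raw]
        # Assumed default label when none is given in the file.
        labels[name] = entry.get("label", name + ".%%")
    return values, labels
```

With something like this, the generator would no longer need to know where the file lives on disk.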
I'm pasting my current solution here as an example/starting point.
YAML FILE
description:
    name: param grid sample test
    description: A sample parameter grid search study

study:
    - name: test_gen
      description: Build the serial version of LULESH.
      run:
          cmd: |
            echo $(TRIAL)_$(SIZE)_$(ITERATION)
          depends: []

generator.parameters:
    SIZE:
        values: [1,2,3,4]
        label: SIZE.%%
    TRIAL:
        values: [5,4]
        label: TRIAL.%%
    ITERATION:
        values: [5,3,6]
        label: ITERATION.%%
PYTHON FILE FOR GENERATOR
import sys

from maestrowf.datastructures.core import ParameterGenerator
from sklearn.model_selection import ParameterGrid
import yaml

try:
    from yaml import CLoader as Loader
except ImportError:
    from yaml import Loader


def get_custom_generator(env, **kwargs):
    """
    Create a custom populated ParameterGenerator.

    This function reads the generator.parameters section of the study
    specification and expands it into a full parameter grid.

    :returns: A ParameterGenerator populated with parameters.
    """
    p_gen = ParameterGenerator()
    # Workaround: the specification path is assumed to be the last CLI argument.
    with open(sys.argv[-1]) as spec_file:
        yml = yaml.load(spec_file, Loader=Loader)
    print(yml)

    p = {}
    labels = {}
    for k, val in yml["generator.parameters"].items():
        print(k, val)
        if isinstance(val["values"], (list, tuple)):
            # Drop the padding duplicates but keep a list: ParameterGrid
            # expects a list-like, and a set has no stable order.
            p[k] = list(dict.fromkeys(val["values"]))
        else:
            p[k] = [val["values"]]
        labels[k] = val["label"]

    # Flatten the grid into one equal-length list of values per parameter.
    grid = ParameterGrid(p)
    p = {}
    for g in grid:
        for k in g:
            p.setdefault(k, []).append(g[k])

    for k, val in p.items():
        p_gen.add_parameter(k, val, labels[k])
    return p_gen
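For anyone who wants to try this without scikit-learn, the grid expansion step can also be sketched with the standard library alone. expand_grid is a hypothetical name used only here; it builds the Cartesian product and returns one equal-length column of values per parameter, which is the shape the generator above passes to add_parameter.

```python
from itertools import product


def expand_grid(values_by_name):
    """Return one equal-length list per parameter covering every combination."""
    names = sorted(values_by_name)  # fixed order so the output is reproducible
    columns = {name: [] for name in names}
    for combo in product(*(values_by_name[name] for name in names)):
        for name, value in zip(names, combo):
            columns[name].append(value)
    return columns


# Same values as the generator.parameters section above: 4 * 2 * 3 combinations.
columns = expand_grid({"SIZE": [1, 2, 3, 4], "TRIAL": [5, 4], "ITERATION": [5, 3, 6]})
```

Each column ends up with 24 entries, so the equal-length requirement is satisfied by construction rather than by padding.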