faculty
Experiments resource-based interface
This PR prototypes adding a resource-based interface for experiment runs, with the following features:
- New primary models using attrs with class methods for querying / getting resources
- Convert client models to 'resource interface' models
- A query return type that behaves as an iterable but can provide extra methods like `as_dataframe` for other view types
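As a rough illustration of the query return type, a result object can wrap the fetched runs, iterate like a normal collection, and offer `as_dataframe` as an extra view. This is only a sketch under assumptions: `QueryResult` and the `Run` namedtuple are hypothetical stand-ins (the PR itself uses attrs models), and pandas is imported lazily so that plain iteration does not depend on it.

```python
from collections import namedtuple

# Hypothetical stand-in for the attrs-based run model in this PR.
Run = namedtuple("Run", ["id", "status"])


class QueryResult(object):
    """Iterable wrapper around runs that can also produce other views."""

    def __init__(self, runs):
        self._runs = list(runs)

    def __iter__(self):
        # Plain iteration, so list(result) and for-loops work as usual.
        return iter(self._runs)

    def __len__(self):
        return len(self._runs)

    def as_dataframe(self):
        # Imported here so that iteration alone does not require pandas.
        import pandas

        return pandas.DataFrame([run._asdict() for run in self._runs])


result = QueryResult(
    [Run(id=1, status="finished"), Run(id=2, status="running")]
)
statuses = [run.status for run in result]
```

Keeping the extra views as methods on the result means callers who only iterate never pay for the pandas import.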
Notes:
- I discussed using dataclasses with @zblz, but we agreed that attrs was a reasonable substitute that lets us maintain Python 2.7 support.
- I've not done anything clever with any of:
  - Caching the session / client
  - Looking up the project ID when not set
- Only some of the run object fields are currently mapped to df columns
- What are the columns in the output df called when metrics / params / tags have conflicting names?
Example usage:

    from faculty.experiments import ExperimentRun

    print(
        list(
            ExperimentRun.query(project_id="f20f5eaa-9cff-4216-8ebd-b88fd294bb3e")
        )
    )
    print(
        ExperimentRun.query(
            project_id="f20f5eaa-9cff-4216-8ebd-b88fd294bb3e"
        ).as_dataframe()
    )

- Caching the session / client
I talked to @zblz about this. In `faculty.client` we call `faculty.session.get_session`, which already handles caching of the session. I don't think it's worth the effort to implement caching just for the client, so I would keep the implementation as is. Does that sound OK?
- Looking up the project ID when not set
Sorry for my confusion: by this, do you mean using the `PROJECT_ID` env variable by default if an argument is not passed?
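If that is the intended behaviour, a minimal sketch of the fallback might look like the following. `resolve_project_id` and its `environ` parameter are hypothetical names, not part of the library, and the variable name `PROJECT_ID` is simply the one mentioned in this comment.

```python
import os


def resolve_project_id(project_id=None, environ=None):
    """Return the explicit project ID, else fall back to PROJECT_ID."""
    if project_id is not None:
        return project_id
    # environ is injectable (an assumption of this sketch) to ease testing.
    environ = os.environ if environ is None else environ
    try:
        return environ["PROJECT_ID"]
    except KeyError:
        raise ValueError(
            "No project_id passed and PROJECT_ID is not set in the environment"
        )
```

An explicit argument always wins, so existing callers that pass `project_id` would see no change.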
- What are the columns in the output df called when metrics / params / tags have conflicting names?
I am not very experienced with pandas; my first thought was to namespace keys by `metrics.` / `params.` / `tags.`, but I don't know if this would be a nuisance in practice?
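To make the prefixing idea concrete, here is a small sketch that flattens one run into a single-level row with namespaced keys. The dict layout of `run` and the `flatten_run` helper are assumptions for illustration only; the real run objects in this PR are attrs models.

```python
def flatten_run(run):
    """Flatten one run into a single-level dict with prefixed keys."""
    row = {"run_id": run["run_id"]}
    for group in ("metrics", "params", "tags"):
        for key, value in run.get(group, {}).items():
            # Prefixing keeps e.g. a metric and a tag both named "loss" apart.
            row["{}.{}".format(group, key)] = value
    return row


run = {
    "run_id": 1,
    "metrics": {"loss": 0.25},
    "params": {"lr": 0.01},
    "tags": {"loss": "mse"},
}
row = flatten_run(run)
```

A list of such rows can be passed straight to `pandas.DataFrame`, giving columns like `metrics.loss` and `tags.loss` that never collide.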
The checks are not passing because apparently `yield from` in `faculty/experiments.py` (line 62) is not valid Python 2 syntax. I am not really sure how to address this?
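One possible fix, assuming the generator only uses `yield from` to forward items (and does not rely on `send()` values or the sub-generator's return value), is an explicit loop, which is valid in both Python 2 and 3. `iter_pages` is a hypothetical example, not the code at line 62.

```python
def iter_pages(pages):
    """Yield every item from each page, one at a time."""
    for page in pages:
        # Python 3 only: yield from page
        for item in page:
            yield item


items = list(iter_pages([[1, 2], [3], []]))
```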