
Version control for machine learning

Results 110 keepsake issues

# Problem It's very long, which makes sense because it's a comprehensive inspection of all the data about an experiment/checkpoint. But the most useful information is at the top, which...

type/enhancement

# Why? It is possible to record params when creating an experiment, and it is possible to record metrics when creating a checkpoint, but sometimes you need to record the...

type/roadmap

[We are seeing failures where the heartbeat is invalid JSON.](https://github.com/replicate/replicate/runs/1622823932) This implies that a read can observe an incomplete write. Any writes to disk storage should be atomic, e.g. using https://github.com/google/renameio. Blocked on...

type/bug
priority/medium
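
To illustrate the atomic-write idea from the issue above, here is a minimal Python sketch of the usual write-to-temp-then-rename pattern. It is not the project's implementation (the issue points at a Go library); the function name `write_json_atomic` and the temp-file naming are assumptions made for illustration only.

```python
import json
import os
import tempfile


def write_json_atomic(path, payload):
    """Write payload as JSON to path so a reader never sees a partial file.

    Hypothetical helper for illustration; not part of any real API.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(payload, f)
            f.flush()
            # Push the data to disk before the rename makes it visible.
            os.fsync(f.fileno())
        # os.replace swaps the file in as a single step, overwriting any old copy.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader that opens `path` sees either the previous complete file or the new complete file, which is the property the heartbeat needs.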

# Why Experiments in Replicate are currently just a "bundle" of experiments, not related to each other. More often than not when running an experiment, you are building off a...

type/roadmap

# Why You can add params to experiments when starting them, but you might also want to add metadata after the fact to annotate them with. For example: - "bad"...

type/roadmap

Currently, running `make develop` and `make test` installs Python packages via the system's default Python (if a virtual env is not set up). Ideally, there should be an optional step to...

type/chore

# Why Since ML models are often slow and expensive to train, we tend to spend a lot of time fine tuning computational performance. If we run our own servers...

type/roadmap

I thought we were tracking Python version but we're not. Some things off the top of my head: - Python version - Operating system - Architecture - CUDA version -...

good first issue
help wanted
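
As a rough sketch of how the first three items in the issue above could be gathered, the following uses only the Python standard library. The function name `collect_environment` and the dictionary keys are hypothetical, not part of the project's API. CUDA version is left out because detecting it depends on which framework or driver tooling is installed.

```python
import platform


def collect_environment():
    """Return basic facts about the runtime (hypothetical helper)."""
    return {
        "python_version": platform.python_version(),  # e.g. "3.11.4"
        "operating_system": platform.system(),        # e.g. "Linux", "Darwin"
        "architecture": platform.machine(),           # e.g. "x86_64", "arm64"
    }
```

The returned dictionary could be stored next to an experiment's params so that two runs can be compared by environment as well as by code.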

# Why Currently we just output simple matplotlib charts. It would be nice to have some interactive plots for: - Viewing data when hovering - Updating output without editing code...

type/roadmap

# Why We record files, but they don't show up in `replicate diff`. This would be useful to understand what changes in code/files caused changes in your metrics. # How...

type/roadmap