How to re-cache a step after modifying it?

Open BigRedT opened this issue 2 years ago • 9 comments

What is the recommended way to recache a step?

I tried deleting the cached file and re-running as recommended by the error message, but I keep getting the following error:

[Screenshot of the error message: Screen Shot 2022-09-19 at 8 14 05 PM]

BigRedT avatar Sep 20 '22 03:09 BigRedT

That might be a bug considering you did what it told you to do, and then you still got an error. @dirkgr is more familiar with the local workspace.

That said, if you want to recache the step because you changed/fixed something with how the step runs, the best way to do that is to change/set the VERSION class variable on your step subclass. For example:

from tango import Step

class MyStep(Step):
    VERSION = "001"  # change this value whenever the step's logic changes
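To illustrate why bumping the version forces a recompute, here is a minimal sketch that assumes the cache is keyed on the step name, its version, and a hash of its arguments. The `cache_key` helper is hypothetical and is not Tango's actual implementation; it only shows the general idea that a new version yields a new key, so the old cache entry is no longer matched.

```python
import hashlib
import json


def cache_key(step_name: str, version: str, kwargs: dict) -> str:
    """Build a toy cache key from the step name, its version, and its arguments.

    Hypothetical helper for illustration only; not part of Tango.
    """
    # Serialize the arguments deterministically so equal inputs hash equally.
    payload = json.dumps(kwargs, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:8]
    return f"{step_name}-{version}-{digest}"


# Same step and arguments, different version: the keys differ,
# so a lookup with the new key misses the old cached result.
old_key = cache_key("MyStep", "001", {"n": 10})
new_key = cache_key("MyStep", "002", {"n": 10})
```

Under this assumption the old result stays on disk but is simply never looked up again, which is why no manual deletion is needed.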

epwalsh avatar Sep 20 '22 17:09 epwalsh

Thanks, setting the version number worked!

BigRedT avatar Sep 20 '22 17:09 BigRedT

Glad you got unblocked. Before you did the version number thing, did you remove the entire directory? Or did you just remove all files in it?

dirkgr avatar Sep 20 '22 17:09 dirkgr

I removed that directory

BigRedT avatar Sep 20 '22 17:09 BigRedT

@dirkgr any updates on this bug?

Even though version numbers are great, sometimes I want to recache a step because of a change somewhere else in the code that affects that step (e.g., rerunning on different random samples). For that purpose, it would be helpful to be able to delete a step's cache and recompute that particular step as well as all of the steps that depend on it.
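A rough sketch of the behavior being requested, assuming the step graph is available as a plain mapping from each step name to the names of the steps it depends on. The function name, the graph layout, and the step names are all made up for illustration; none of this is an existing Tango API.

```python
from __future__ import annotations

from collections import deque


def steps_to_invalidate(dependencies: dict[str, list[str]], changed: str) -> set[str]:
    """Return `changed` plus every step that depends on it, directly or indirectly."""
    # Invert the graph: map each step to the steps that consume its output.
    dependents: dict[str, list[str]] = {}
    for step, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(step)

    # Breadth-first walk over the inverted graph, starting at the changed step.
    stale = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale


# Hypothetical pipeline: each key lists the steps it depends on.
graph = {
    "load": [],
    "sample": ["load"],
    "train": ["sample"],
    "evaluate": ["train", "load"],
}
stale = steps_to_invalidate(graph, "sample")
```

Here `stale` contains `sample`, `train`, and `evaluate`, while `load` is untouched because nothing upstream of it changed.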

BigRedT avatar Sep 28 '22 19:09 BigRedT

+1 for this. I ran into a similar situation where the step in question was defined in a dependency of my project (catwalk) so I couldn’t tick the version.

epwalsh avatar Sep 28 '22 19:09 epwalsh

Still not a big fan, but maybe add an API to the workspace that allows you to delete a cache entry, and then expose it in the CLI?

dirkgr avatar Oct 03 '22 18:10 dirkgr

> the step in question was defined in a dependency of my project (catwalk) so I couldn’t tick the version

Why did you have to re-run it then?

dirkgr avatar Oct 03 '22 18:10 dirkgr

Also ran into a similar issue.

> maybe add an API to the workspace that allows you to delete a cache entry, and then expose it in the CLI?

This would be useful. Removing the directory works for a local workspace, but for remote workspaces, not all of the information resides in the bucket. For instance, in GSWorkspace, step info is stored in the datastore, which allows for better speeds, but also means that removing a run or step requires deleting the bucket entry as well as the datastore entry.
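To make the two-store problem concrete, here is a toy sketch in which plain dictionaries stand in for the bucket and the datastore. `ToyRemoteWorkspace`, `put`, and `delete_step` are hypothetical names chosen for this example and do not reflect GSWorkspace's real interface; the point is only that a single delete call has to clear both places.

```python
from __future__ import annotations


class ToyRemoteWorkspace:
    """Hypothetical stand-in for a remote workspace; not GSWorkspace's real API."""

    def __init__(self) -> None:
        self.bucket: dict[str, bytes] = {}    # stand-in for stored step results
        self.datastore: dict[str, dict] = {}  # stand-in for step info records

    def put(self, step_id: str, result: bytes, info: dict) -> None:
        """Record a finished step in both stores."""
        self.bucket[step_id] = result
        self.datastore[step_id] = info

    def delete_step(self, step_id: str) -> bool:
        """Remove the step from both stores; return True if anything was deleted."""
        had_result = self.bucket.pop(step_id, None) is not None
        had_info = self.datastore.pop(step_id, None) is not None
        return had_result or had_info


ws = ToyRemoteWorkspace()
ws.put("MyStep-001-abc", b"cached result", {"state": "completed"})
deleted = ws.delete_step("MyStep-001-abc")
```

Deleting only the bucket entry would leave a step info record that still claims the step is complete, which is the inconsistency a workspace-level delete API would avoid.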

AkshitaB avatar Jun 28 '23 19:06 AkshitaB