
Best practices for recursive builds

Open tcholewik opened this issue 8 years ago • 2 comments

I'm working on a project where I have to compile a series of reports. I intend to have one report for each day, and I need incremental intraday reports as well. To build the intraday reports I can either execute queries that collect data from midnight until now, or check when the last report ran and have my query pull just the additional data.

For now I can just query the whole day, but the second solution poses a puzzle, and I wonder whether it is possible in remake.

Suppose I generated report_today.html from data_today.csv. To query just the new data, I would have a step that checks data_today.csv for the most recent record timestamp and uses it as input for a query along the lines of SELECT * FROM SOMETABLE WHERE TIME > {{MOST_RECENT_RECORD}}. I could then query the database and append the results to data_today.csv.
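The incremental step described above can be sketched outside remake, to make the read-then-append cycle on data_today.csv concrete. This is a minimal sketch in Python, not remake code. It assumes a SQLite connection, a hypothetical two-column layout (`time`, `value`) for SOMETABLE, and ISO-8601 timestamp strings so that plain string comparison orders them correctly. The helper names `most_recent_timestamp` and `append_new_rows` are made up for illustration.

```python
import csv
import sqlite3
from pathlib import Path

# Hypothetical column layout; the thread does not say what SOMETABLE contains.
FIELDS = ["time", "value"]


def most_recent_timestamp(csv_path):
    """Return the largest `time` value in the CSV, or None if there are no records."""
    path = Path(csv_path)
    if not path.exists() or path.stat().st_size == 0:
        return None
    with path.open(newline="") as f:
        times = [row["time"] for row in csv.DictReader(f)]
    # Assumes ISO-8601 strings, which sort correctly as plain text.
    return max(times) if times else None


def append_new_rows(conn, csv_path):
    """Append rows newer than the CSV's latest record; return how many were added."""
    path = Path(csv_path)
    latest = most_recent_timestamp(path)
    if latest is None:
        # No data yet: fall back to pulling everything.
        cursor = conn.execute("SELECT time, value FROM SOMETABLE ORDER BY time")
    else:
        cursor = conn.execute(
            "SELECT time, value FROM SOMETABLE WHERE time > ? ORDER BY time",
            (latest,),
        )
    rows = cursor.fetchall()
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(FIELDS)
        writer.writerows(rows)
    return len(rows)
```

Calling `append_new_rows(conn, "data_today.csv")` repeatedly pulls only what arrived since the previous call. Note that the function both reads and writes the same file, which is exactly the circular relationship the question is about.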

What worries me is that in the setup described above data_today.csv is both a dependency of the first step and a target file of the last step, so by the time the workflow finishes running we have already invalidated the dependency of step 1.

So my questions are:

  1. Is remake prepared to handle this situation?
  2. Is there a way to decouple this target/dependency relationship?
  3. What are the best practices for handling this?

tcholewik, Dec 11 '17 21:12

What if you write to data_today.csv as a side effect instead of declaring it as a target?

wlandau-lilly, Dec 11 '17 21:12

I suppose that's one way to do it. In that case I assume we'd always have to call the data loading target manually, since remake does not support phony targets yet.

tcholewik, Dec 11 '17 21:12