
More examples and tutorials for metaflow

Open xzenggit opened this issue 5 years ago • 7 comments

Thanks a lot for open-sourcing this great library. Is it possible to provide more real-world examples of using this tool? It would be really helpful to have a real-world example that goes through a whole Data Science or Machine Learning project life cycle, such as data loading/cleaning, parameter tuning, model deployment and performance monitoring. Many thanks!

xzenggit avatar Dec 11 '19 21:12 xzenggit

@xzenggit Yes indeed! Providing case studies is high up on our fast-follow list. We should have something available really soon.

savingoyal avatar Dec 11 '19 21:12 savingoyal

From #69

It would be really great to have at least one example of how to use the library to do ML predictions. It would be great if it showed the following use cases:

  • Batch prediction (fairly obvious)
  • Real-time prediction, for instance in a REST service (not as obvious)

elben10 avatar Dec 19 '19 19:12 elben10

I would really appreciate more tutorials. I'm a student and I'm trying to implement metaflow into a project, but I use complex classes and inheritance. At the moment I don't understand how to implement metaflow into my reusable classes, which do more than one thing.

  • Is inheritance available with metaflow? I would have two flows, two start functions and so on. That doesn't seem to work.
  • Can I implement more than one flow into one class? Same problem as with inheritance. Example: Training and predicting.

Machuntox avatar Jan 02 '20 08:01 Machuntox

@Machuntox, I am not quite sure what you would like to do. In Metaflow, one class is one flow. If you would like to reuse portions of code between flows, I would recommend you implement the common functionality in regular methods in a non-Metaflow Python class and then call that from your various flows.
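A minimal sketch of that suggestion, using only plain Python. `TextCleaner` and the two `*_start_step` functions are hypothetical names invented for illustration; the functions stand in for step bodies in two separate flows, where a real flow would use methods decorated with `@step` and store the result on `self`:

```python
# Shared logic lives in an ordinary class with no Metaflow dependency.
class TextCleaner:
    """Hypothetical reusable helper shared by several flows."""

    def __init__(self, lowercase=True):
        self.lowercase = lowercase

    def clean(self, records):
        # Strip whitespace, optionally lowercase, and drop empty records.
        cleaned = []
        for record in records:
            text = record.strip()
            if self.lowercase:
                text = text.lower()
            if text:
                cleaned.append(text)
        return cleaned


# Stand-in for the body of a step in a training flow.
def training_start_step(raw_records):
    return TextCleaner().clean(raw_records)


# Stand-in for the body of a step in a prediction flow.
def prediction_start_step(raw_records):
    return TextCleaner(lowercase=False).clean(raw_records)
```

Because the helper knows nothing about flows, each flow stays a single class while the reusable code can use whatever class hierarchy it needs.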

@elben10: We plan on open sourcing our hosting solution which would allow you to do a REST service easily but it is not yet available (see #3)

romain-intel avatar Jan 05 '20 02:01 romain-intel

I think I am in a similar situation to @Machuntox. My project has a complex class hierarchy, and the problem that I'm facing is that the flow steps that I tried to implement on the parent class are not recognized.

pdMa2s avatar Feb 10 '20 15:02 pdMa2s

@Machuntox @pdMa2s Here are some thoughts on organizing more complex projects.

I like to implement complex business logic in a separate class that does not inherit from FlowSpec. I like to make it completely stateless (e.g. no attributes, purely functional); the class just organizes your complex model training and scoring logic and any other 'business logic'. You can test this code in a notebook to make sure it works as expected, independent of your flow:

Example:

class MyModelClass(object):
     def init_model(self, ...):
          ...
          return model_object

     def train(self, model_object, data, ...):
          ...
          return model_object

     def score(self, model_object, data, ...):
          scores = ...
          return scores
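To make the "test it independent of your flow" point concrete, here is a sketch with the placeholders filled in by a deliberately trivial model. `MeanModelClass` and its dict-based model object are assumptions made up for this illustration, not part of Metaflow or of the original example:

```python
class MeanModelClass:
    """Hypothetical stateless logic: all state lives in the model object passed in."""

    def init_model(self):
        # The "model" is just a running mean and a sample count.
        return {"mean": 0.0, "n": 0}

    def train(self, model_object, data):
        # Return a new model object instead of mutating the input.
        n = model_object["n"] + len(data)
        total = model_object["mean"] * model_object["n"] + sum(data)
        return {"mean": total / n if n else 0.0, "n": n}

    def score(self, model_object, data):
        # Score each point by its absolute distance from the learned mean.
        return [abs(x - model_object["mean"]) for x in data]


# Notebook-style check, with no flow involved.
logic = MeanModelClass()
model = logic.train(logic.init_model(), [1.0, 2.0, 3.0])
scores = logic.score(model, [2.0, 5.0])
```

Since the class keeps no attributes of its own, the same `model` value can be passed between calls here exactly as it would be passed between steps as a flow artifact.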

You can then reuse this class in your different workflows as a mixin (e.g. inherit from FlowSpec and then your custom class). Since your custom class is purely functional, the state can be nicely managed in your flow. This keeps the flow focused on the code narrative and the state management, while the model and business details live in your reusable custom class:

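from metaflow import FlowSpec, step

class MyTrainingFlow(FlowSpec, MyModelClass):
     @step
     def start(self):
          # Assigning to self stores the model as a flow artifact.
          self.model = self.init_model(...)
          self.next(self.train_model)

     @step
     def train_model(self):
          # The mixin method is stateless; the flow carries the state.
          self.model = self.train(self.model, ...)
          self.next(self.end)

     @step
     def end(self):
          pass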

And similarly, for scoring:

from metaflow import Flow, FlowSpec, step

class MyScoringFlow(FlowSpec, MyModelClass):
     @step
     def start(self):
          # Load the model artifact from the latest successful training run.
          training_run = Flow('MyTrainingFlow').latest_successful_run
          self.model = training_run['end'].task.data.model
          self.next(self.score_model)

     @step
     def score_model(self):
          self.scores = self.score(self.model, ...)
          self.next(self.end)

     @step
     def end(self):
          pass

When we release our hosting solution, you can use the custom class and data artifacts in a similar way.

bergdavidj avatar Feb 10 '20 18:02 bergdavidj

@bergdavidj Thank you for taking the time to compose such a detailed response. I'll take those suggestions into consideration.

pdMa2s avatar Feb 17 '20 14:02 pdMa2s

I will close this issue since we now have many tutorials available at https://outerbounds.com/docs/data-science-welcome/

savingoyal avatar Mar 29 '23 16:03 savingoyal