cerebros-core-algorithm-alpha

Second attempt: add support for tf.data.Dataset

Open david-thrower opened this issue 2 months ago • 0 comments

The problem

  • Unfortunately, tf.keras.Model.fit() does not accept separate datasets for its x and y args: when x is a tf.data.Dataset, y must be left unset and the dataset itself has to yield (inputs, targets) pairs.
  • As a result, we can't train from larger-than-memory generators (like the generator that streams the auto-regressive expansion of tokenized text).
  • This precludes training the LLM on a reasonably sized dataset ... unless we want to pay for a TB of RAM...
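To illustrate why the expanded data does not fit in memory, here is a minimal sketch of a streaming auto-regressive expansion. It is not the project's actual generator: the function name `autoregressive_pairs`, the left-padding scheme, and the `max_len` / `pad_id` parameters are all assumptions made for illustration. A sequence of N tokens expands into N - 1 samples of length `max_len` each, which is why yielding them one at a time matters.

```python
from typing import Iterator, List, Sequence, Tuple


def autoregressive_pairs(
    token_ids: Sequence[int],
    max_len: int,
    pad_id: int = 0,
) -> Iterator[Tuple[List[int], int]]:
    """Lazily yield (left-padded prefix, next token) pairs.

    Hypothetical helper: the name, padding side, and parameters are
    illustrative assumptions, not taken from the repository.
    """
    for i in range(1, len(token_ids)):
        # Keep at most the last `max_len` tokens before position i.
        prefix = list(token_ids[max(0, i - max_len):i])
        # Left-pad so every sample has the same fixed length.
        padded = [pad_id] * (max_len - len(prefix)) + prefix
        yield padded, token_ids[i]


if __name__ == "__main__":
    for features, target in autoregressive_pairs([5, 6, 7], max_len=2):
        print(features, target)
```

A generator like this yields (inputs, targets) pairs, which is the shape a single tf.data.Dataset needs; wrapping it with tf.data.Dataset.from_generator and an output_signature would be one way to feed it to .fit without materializing every sample.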

The feature:

Allow either x=foo and y=bar to be passed, or dataset=Dataset() (which packages foo and bar as x and y), and selectively run .fit based on which of these mutually exclusive parameter configurations is present.
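One possible shape for this dispatch is sketched below. The function name `fit_dispatch` and its signature are hypothetical, not part of the existing codebase; the only assumption about `model` is that it exposes a Keras-style `.fit(x=None, y=None, **kwargs)` method.

```python
from typing import Any, Optional


def fit_dispatch(
    model: Any,
    x: Optional[Any] = None,
    y: Optional[Any] = None,
    dataset: Optional[Any] = None,
    **fit_kwargs: Any,
) -> Any:
    """Call model.fit with either (x, y) or a single dataset, never both.

    Hypothetical helper: name and signature are illustrative assumptions.
    """
    has_arrays = x is not None or y is not None

    # The two configurations are mutually exclusive.
    if dataset is not None and has_arrays:
        raise ValueError("Pass either x and y, or dataset, but not both.")

    if dataset is not None:
        # The dataset is expected to yield (inputs, targets) pairs,
        # so y is deliberately left unset.
        return model.fit(dataset, **fit_kwargs)

    if x is None or y is None:
        raise ValueError("Both x and y are required when dataset is not given.")

    return model.fit(x=x, y=y, **fit_kwargs)
```

Validating the arguments up front keeps the failure close to the caller instead of surfacing later as an error from inside .fit.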

david-thrower, Sep 15 '25 19:09