cerebros-core-algorithm-alpha
Second attempt: add support for tf.data.Dataset
The problem:
- Unfortunately, tf.keras.Model.fit() does not accept separate datasets for its x and y args: when x is a tf.data.Dataset, the targets must come from the dataset itself and y must be left unset.
- This makes us unable to train on generators whose output does not fit in memory (like the generator that streams the auto-regressive expansion of tokenized text).
- This precludes training the LLM on a reasonably sized dataset ... unless we want to pay for a TB of RAM...
The feature:
Allow either x=foo and y=bar to be passed, or dataset=Dataset() (which packages foo and bar as x and y), and call .fit accordingly, depending on which of these mutually exclusive parameter configurations is present.
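A minimal sketch of how the mutually exclusive check could look. The helper name `resolve_fit_kwargs` and its parameters are hypothetical and not part of the existing codebase; the only assumption about Keras is the documented behavior that a dataset is passed as `x` with `y` left unset.

```python
from typing import Any, Dict, Optional


def resolve_fit_kwargs(x: Optional[Any] = None,
                       y: Optional[Any] = None,
                       dataset: Optional[Any] = None) -> Dict[str, Any]:
    """Build the data kwargs for model.fit() from one of two exclusive configurations.

    Hypothetical helper: either (x, y) or dataset may be given, never both.
    """
    has_arrays = x is not None or y is not None
    has_dataset = dataset is not None

    if has_arrays and has_dataset:
        raise ValueError("Pass either x and y, or dataset, not both.")

    if has_dataset:
        # Keras takes the targets from the dataset's (x, y) elements,
        # so the dataset goes in as x and y is omitted.
        return {"x": dataset}

    if x is None or y is None:
        raise ValueError("Pass both x and y, or a dataset.")

    return {"x": x, "y": y}
```

The training code could then call something like `model.fit(**resolve_fit_kwargs(x=foo, y=bar, dataset=ds), epochs=n)`, so the branching lives in one place and the rest of the .fit call stays the same for both configurations.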