Marcus Svensson
Thank you for the fast reply. Would you recommend using `'val_total_loss'` or `'factorized_top_k/top_100_categorical_accuracy'` to monitor during early stopping? My reasoning in a "regular" DNN model is to monitor validation loss...
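One thing that differs between the two candidates is the direction of improvement: a loss should go down, while a top-k accuracy should go up. The sketch below is a plain-Python illustration of patience-based early stopping with that direction made explicit. It is not the Keras implementation; the helper name `early_stop_epoch` and the per-epoch numbers are made up for illustration. In Keras itself, the `EarlyStopping` callback exposes the same idea through its `monitor`, `mode`, and `patience` arguments.

```python
from typing import Optional, Sequence


def early_stop_epoch(history: Sequence[float], mode: str, patience: int = 2) -> Optional[int]:
    """Return the 0-based epoch at which training would stop, or None if it never stops.

    mode="min" treats lower values as better (e.g. a loss);
    mode="max" treats higher values as better (e.g. an accuracy).
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    best = None
    waited = 0  # consecutive epochs without improvement
    for epoch, value in enumerate(history):
        improved = best is None or (value < best if mode == "min" else value > best)
        if improved:
            best = value
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical per-epoch values, invented for this example only.
val_total_loss = [5.0, 4.2, 4.0, 4.1, 4.3, 4.4]
top_100_accuracy = [0.10, 0.15, 0.19, 0.21, 0.22, 0.22]

loss_stop = early_stop_epoch(val_total_loss, mode="min")
acc_stop = early_stop_epoch(top_100_accuracy, mode="max")
print(loss_stop, acc_stop)
```

With these invented numbers the two metrics disagree on when to stop, which is why the choice of monitored metric (and its mode) matters.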
I'm not sure why you would not use the query with exclusions feature; it seems to fit your use case. However, if for some reason you cannot do that, one idea...
I tried to use TensorFlow's `Checkpoint` and `CheckpointManager` as described in the [documentation](https://www.tensorflow.org/guide/checkpoint) but unfortunately encountered a similar Out-Of-Memory error: ``` (4) Resource exhausted: OOM when allocating tensor with shape[3708977407]...
Any idea how this would be done when running the pipeline on Vertex AI? Can you access the state of the component/pipeline run somehow?
> @HaiyiMei's workaround from [here](https://github.com/tiangolo/typer/issues/182#issuecomment-1708245110) seems to actually work quite well for this. > > ``` > import typer > import click > > import enum > from typing import...