farrokhsiar
Thanks, @danielcohenlive. Did you instantiate it the same way I described? And if so, could you share the version of Ax you used to reproduce the problem?
Yes! I ran it and got the same results!
I think I found the exact problem. I am using Ax in service mode with the SQL backend. On experiment retrieval, I use: `ax_client = AxClient(db_settings=self.db_setting)` followed by `ax_client.load_experiment_from_database(experiment_name=name)`. The...
To help with reproducing the issue, my current workflow is:
1. Create an AxClient with a specific generation strategy and Postgres storage
2. Retrieve the client from the DB
3. Perform a...
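The steps above can be sketched roughly as follows. This is a minimal sketch, assuming Ax's Service API with SQL storage; the connection URL, the experiment name, and the helper name `reload_ax_client` are hypothetical placeholders, not part of the original report:

```python
def reload_ax_client(experiment_name: str):
    """Sketch: recreate an AxClient backed by SQL storage and reload
    a previously saved experiment from the database (steps 1-2 above)."""
    # Imports kept local so the sketch stays self-contained; requires `pip install ax-platform`.
    from ax.service.ax_client import AxClient
    from ax.storage.sqa_store.structs import DBSettings

    # Hypothetical Postgres connection URL.
    db_settings = DBSettings(url="postgresql://user:pass@localhost/ax")
    ax_client = AxClient(db_settings=db_settings)
    # Step 2: retrieve the experiment saved under this name.
    ax_client.load_experiment_from_database(experiment_name=experiment_name)
    # Step 3: continue optimization, e.g. ask for the next trial.
    params, trial_index = ax_client.get_next_trial()
    return params, trial_index
```

If the generation strategy is not restored correctly on reload, this is the point at which the mismatch would surface.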
That is my understanding! But even after gradient clipping, the problem persisted. For example, when I try to predict the next tokens 12179 4675 11374 1807, ..., after one step...
I switched to another encoder because I couldn't get the problem solved.
I am facing the same problem. It works with batch size = 1 on CUDA, but anything larger returns zero. Any solution for this one?